  1. #31
    Senior Member
    Join Date
    Dec 2020
    Posts
    126
    That sort of rings a bell...

    You might find some interesting pointers here: https://serverfault.com/questions/73...ons-in-windows

  2. #32
    Junior Member
    Join Date
    Dec 2020
    Posts
    15
    Quote Originally Posted by fpo View Post
    I'm also experiencing dropouts. While reproducible, the exact timing of the dropouts depends on circumstances like stream buffer size and mp3 bitrate.
    When logging to a terminal with -d all=sdebug the dropouts do occur. When logging the same to a file with -f <logfile>, however, the dropouts are gone. When they occur, one thread in squeezelite starts using nearly 100% of one CPU core, more than 95% of that in kernel mode. Following the onset of 100% CPU usage, the stream buffer runs empty and then the dropout occurs. To see this I had to run squeezelite from a tailored program, redirecting the error stream, logging to memory and writing it all to a file after the tests so as not to alter squeezelite's internal timing. I tested with squeezelite v1.9.8-1307 (the test versions like 1318-test3 seem to be gone).
    Good progress! I did not notice any high CPU usage during the outage, but I'll take another look and do more tests. It does seem true that the "stream buffer runs empty", causing the dropout, because after 30 seconds it just grabs another chunk and continues normally. Thus, early on, I suspected it was some kind of buffering issue.

    I have not run with sdebug level yet, but seeing this code in the stream_thread:
    LOG_SDEBUG("poll timeout");
    I think there might be some sdebug messages that will help with this issue. I ran it like this:
    squeezelite-1.9.8-1318-test2-win32 -d all=sdebug -n SqueezeLite-bogan-admin -s nas -o 1 2>err.log
    This redirects STDERR to the err.log file (outputting to the console produces too much text and audio gets dropped, so it must go to a file).
    The outage occurred after 5 minutes, as expected. Looking at the logs, I see a few things going on.

    First, we see 'normal' behavior, where the streambuf is allocated, the buffer is read, and the streambuf returns to 'full' with 2097151 bytes.
    The decode_thread() reports the size of the streambuf and outputbuf, using _buf_used().
    [11:52:08.332] decode_thread:75 streambuf bytes: 1963597 outputbuf space: 172615
    [11:52:08.394] stream_thread:428 streambuf read 133554 bytes
    [11:52:08.440] decode_thread:75 streambuf bytes: 2097151 outputbuf space: 208967
    ...
    Here we see that 1963597 + 133554 = 2097151. It continues like this:
    [11:52:08.442] decode_thread:75 streambuf bytes: 2095221 outputbuf space: 172103
    [11:52:08.501] stream_thread:428 streambuf read 1930 bytes
    [11:52:08.548] decode_thread:75 streambuf bytes: 2097151 outputbuf space: 208455
    ...
    Various amounts are allocated, but it always returns to 2097151 bytes. It continues this way until we see it come in pairs:
    [11:52:18.017] decode_thread:75 streambuf bytes: 2095430 outputbuf space: 198215
    [11:52:18.017] decode_thread:75 streambuf bytes: 2095430 outputbuf space: 198215
    [11:52:18.124] decode_thread:75 streambuf bytes: 2095430 outputbuf space: 234567
    [11:52:18.124] stream_thread:428 streambuf read 1721 bytes
    [11:52:18.124] stream_thread:428 streambuf read 1826 bytes
    [11:52:18.124] decode_thread:75 streambuf bytes: 2097151 outputbuf space: 197703
    [11:52:18.233] decode_thread:75 streambuf bytes: 2097151 outputbuf space: 234055
    ....
    Looks fishy. Why is it now reporting decode_thread() twice?
    Also, I see that the numbers no longer add up. 2095430 + 1721 = 2097151, but the 1826-byte read does not add up correctly. Weird.
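    As far as I can tell, _buf_used() is just a ring-buffer fill count. Here is a minimal sketch of how that kind of count is typically computed (illustrative struct and function names only, not the actual squeezelite source):
    Code:
    #include <stddef.h>

    /* sketch of a ring-buffer "bytes used" count, similar in spirit to _buf_used() */
    struct ringbuf {
    	unsigned char *buf;     /* start of storage */
    	unsigned char *readp;   /* next byte the decode thread will read */
    	unsigned char *writep;  /* next byte the stream thread will write */
    	size_t size;            /* total capacity in bytes */
    };

    static size_t buf_used(const struct ringbuf *b) {
    	/* if the write pointer has not wrapped past the read pointer,
    	   used = writep - readp; otherwise the used region wraps around */
    	return b->writep >= b->readp
    		? (size_t)(b->writep - b->readp)
    		: b->size - (size_t)(b->readp - b->writep);
    }
    If that is what is going on, the 'full' value of 2097151 rather than 2097152 would be consistent with the usual convention of keeping one byte free so that a completely full buffer can be told apart from an empty one.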

    It continues normally for 2 minutes, and then this is the last time we see the streambuf at its full 2097151 bytes:
    [11:53:54.105] decode_thread:75 streambuf bytes: 2097151 outputbuf space: 190791
    [11:53:54.213] decode_thread:75 streambuf bytes: 2097151 outputbuf space: 227143
    ...
    Now it starts to steadily decrease over the next 2 minutes:
    [11:53:54.753] decode_thread:75 streambuf bytes: 2085726 outputbuf space: 196807
    [11:53:54.862] decode_thread:75 streambuf bytes: 2085726 outputbuf space: 233159
    They come in pairs, with the same streambuf size, but the outputbuf is allocated/deallocated.
    [11:54:00.056] decode_thread:75 streambuf bytes: 1989484 outputbuf space: 170439
    [11:54:05.764] decode_thread:75 streambuf bytes: 1889334 outputbuf space: 188103
    [11:54:10.381] decode_thread:75 streambuf bytes: 1808956 outputbuf space: 202055
    [11:54:15.405] decode_thread:75 streambuf bytes: 1717460 outputbuf space: 186311
    [11:54:20.750] decode_thread:75 streambuf bytes: 1619750 outputbuf space: 205895
    [11:54:25.117] decode_thread:75 streambuf bytes: 1539989 outputbuf space: 174919
    [11:54:30.180] decode_thread:75 streambuf bytes: 1452300 outputbuf space: 214343
    [11:54:40.462] decode_thread:75 streambuf bytes: 1262312 outputbuf space: 181575
    [11:54:50.045] decode_thread:75 streambuf bytes: 1087811 outputbuf space: 198471
    [11:55:00.097] decode_thread:75 streambuf bytes: 911919 outputbuf space: 194631
    [11:55:10.065] decode_thread:75 streambuf bytes: 730428 outputbuf space: 181959
    [11:55:20.039] decode_thread:75 streambuf bytes: 558233 outputbuf space: 243015
    [11:55:30.000] decode_thread:75 streambuf bytes: 376119 outputbuf space: 221255
    [11:55:40.117] decode_thread:75 streambuf bytes: 183016 outputbuf space: 207687
    [11:55:50.088] decode_thread:75 streambuf bytes: 3614 outputbuf space: 204231
    [11:55:51.072] decode_thread:75 streambuf bytes: 1945 outputbuf space: 512711
    [11:55:53.256] decode_thread:75 streambuf bytes: 1945 outputbuf space: 1276103
    [11:55:55.241] decode_thread:75 streambuf bytes: 1945 outputbuf space: 1975879
    [11:55:57.212] decode_thread:75 streambuf bytes: 1945 outputbuf space: 2675655
    [11:55:58.304] decode_thread:75 streambuf bytes: 1945 outputbuf space: 3057351
    [11:55:59.624] decode_thread:75 streambuf bytes: 1945 outputbuf space: 3527999
    ...
    The streambuf bytes have decreased to 1945 and stay there, until the audio dropout occurs a few seconds later.
    Next, we see that decode_thread stops and the outputbuf drops to zero. Then, we see the "strm command p" which heralds the outage:
    [11:55:59.690] sendSTAT:200 received bytesL: 4203604 streambuf: 1945 outputbuf: 0 calc elapsed: 231206 real elapsed: 231437 (diff: -231) device: 175 delay: 15
    [11:55:59.690] process:527 strm
    [11:55:59.690] process_strm:280 strm command t
    [11:55:59.690] sendSTAT:166 ms_played: 231206 (frames_played: 10203311 device_frames: 7761)
    [11:55:59.690] sendSTAT:195 STAT: STMt
    [11:55:59.690] sendSTAT:200 received bytesL: 4203604 streambuf: 1945 outputbuf: 0 calc elapsed: 231206 real elapsed: 231437 (diff: -231) device: 175 delay: 15
    [11:55:59.690] process:527 strm
    [11:55:59.691] process_strm:280 strm command p

    While the outage continues for 30 seconds, we continue to see:
    [11:56:03.745] decode_thread:75 streambuf bytes: 1945 outputbuf space: 3527999
    Until the return of audio is heralded by command q:
    [11:56:30.787] process_strm:280 strm command q
    Then we see recovery: a new audio stream starts and the streambuf returns to a larger value:
    [11:56:31.187] decode_thread:75 streambuf bytes: 994737 outputbuf space: 3527999
    [11:56:31.187] _check_lame_header:163 gapless: skip: 2257 samples: 12759040 delay: 1105 padding: 1711
    [11:56:31.187] mad_decode:247 setting track_start
    [11:56:31.187] mad_decode:276 gapless: skipping 1152 frames at start
    [11:56:31.187] mad_decode:302 write 0 frames
    [11:56:31.187] mad_decode:235 mad_frame_decode error: lost synchronization
    [11:56:31.187] decode_thread:75 streambuf bytes: 992689 outputbuf space: 3527999
    And..
    [11:56:31.202] decode_thread:75 streambuf bytes: 991984 outputbuf space: 3527999
    [11:56:31.203] decode_thread:75 streambuf bytes: 991983 outputbuf space: 3527999
    [11:56:31.203] decode_thread:75 streambuf bytes: 991982 outputbuf space: 3527999
    [11:56:31.203] mad_decode:235 mad_frame_decode error: bad main_data_begin pointer
    [11:56:31.203] decode_thread:75 streambuf bytes: 991981 outputbuf space: 3527999
    And...
    [11:56:31.233] stream_thread:436 poll timeout

    To summarize, we see:
    * The decode_thread streambuf count (_buf_used()) declines towards zero in the lead-up to the audio dropout.
    * The dropout is heralded by the "strm command p" (pause)
    * The dropout lasts for almost exactly 30 seconds
    * The end of the dropout is heralded by the "strm command q" (quit)
    * After audio resumes, a few errors were seen:
    -- the mad_frame_decode errors: "lost synchronization" and "bad main_data_begin pointer".
    -- the "poll timeout". The stream_thread polls the stream to get its status for each chunk of data it processes; here it timed out after 100 ms (see the sketch after this list).
    * The errors suggest there is an issue with the way the stream buffers are managed.
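    For reference, here is a minimal sketch of the kind of 100 ms poll-and-log pattern that produces the "poll timeout" message (illustrative only, POSIX poll(); not the actual stream.c code, and a Windows build would use the Winsock equivalent):
    Code:
    #include <poll.h>
    #include <stdio.h>

    /* wait up to 100 ms for the stream socket to become readable */
    static int wait_for_stream_data(int fd) {
    	struct pollfd pfd = { .fd = fd, .events = POLLIN };

    	int ret = poll(&pfd, 1, 100);           /* 100 ms timeout */
    	if (ret == 0) {
    		fprintf(stderr, "poll timeout\n");  /* nothing arrived in time */
    		return 0;
    	}
    	if (ret < 0) {
    		perror("poll");                     /* interrupted or socket error */
    		return -1;
    	}
    	return 1;                               /* readable (or the peer hung up) */
    }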

    I hope that this provides a more complete picture of the circumstances surrounding the audio dropout.

  3. #33
    Junior Member
    Join Date
    Dec 2020
    Posts
    15
    Quote Originally Posted by fpo View Post
    To me, what I found points to a possibly spurious race condition, which may be hard to reproduce on a specific machine. E.g. I can avoid triggering the dropouts when using a big stream buffer like -b 16384. It may very well be that error 0x2745 is a consequence of a race condition as opposed to a cause.
    I could do some more testing if you would like to give me some pointers where to look further.
    I agree it appears to be a "race condition" -- a situation where the outcome depends on timing, or whichever bit of software "wins the race". The first place to look is the stream_thread() and any static variables it uses. Unfortunately, a typical C program makes heavy use of static variables, and SqueezeLite is no different.
    I see that the stream_thread() uses a static streambuf, so it would be wise to ensure that the thread uses only thread-safe access to streambuf and other static variables. Maybe the 30-second timeout we see is related to the mutex_lock() used throughout the stream_thread. Debugging race conditions is tough; good luck with that!
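    To illustrate what I mean by thread-safe access, here is a minimal sketch of a mutex-protected ring buffer shared between the stream thread (writer) and the decode thread (reader). The names are illustrative (plain pthreads); squeezelite wraps this sort of thing in its own mutex macros:
    Code:
    #include <pthread.h>
    #include <stddef.h>
    #include <string.h>

    struct shared_buf {
    	unsigned char  *buf;      /* base of storage */
    	unsigned char  *wrap;     /* buf + size */
    	unsigned char  *readp;    /* decode thread reads here */
    	unsigned char  *writep;   /* stream thread writes here */
    	size_t          size;
    	pthread_mutex_t mutex;
    };

    /* stream thread: append n bytes of network data (n assumed to fit before the wrap point) */
    static void buf_put(struct shared_buf *b, const unsigned char *data, size_t n) {
    	pthread_mutex_lock(&b->mutex);
    	memcpy(b->writep, data, n);
    	b->writep += n;
    	if (b->writep >= b->wrap) b->writep -= b->size;   /* wrap around */
    	pthread_mutex_unlock(&b->mutex);
    }

    /* decode thread: mark n bytes as consumed (n assumed to be <= bytes available) */
    static void buf_consume(struct shared_buf *b, size_t n) {
    	pthread_mutex_lock(&b->mutex);
    	b->readp += n;
    	if (b->readp >= b->wrap) b->readp -= b->size;     /* wrap around */
    	pthread_mutex_unlock(&b->mutex);
    }
    Without the lock, readp/writep can change while the other thread is computing the fill level, which is exactly the sort of thing that shows up as intermittent, timing-dependent misbehavior.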

    Larger Buffer
    Ok, let me do a test with the larger buffer, as you suggest.
    squeezelite-1.9.8-1318-test2-win32 -d all=debug -b 16384 -n SqueezeLite-bogan-admin -s nas -o 1

    No dropout was heard even after 30 minutes!!

    The new buffer size affected the streambuf size reported by decode_thread():
    [13:54:43.545] decode_thread:75 streambuf bytes: 3588759 outputbuf space: 3527999

    Now it is bigger, roughly 3.5 MB compared with 2 MB before. Again, we see this value steadily decreasing during playback, until it is as small as 864 bytes.
    [13:57:46.637] decode_thread:75 streambuf bytes: 864 outputbuf space: 65095
    Previously it took 2 minutes to use up the stream buffer (run it down to zero); now it takes 3 minutes. Makes sense.
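    As a rough cross-check of those drain times against the log timestamps (my arithmetic, assuming the buffer drains at roughly the MP3 decode rate once refills stop):
    2097151 bytes / ~117 s (11:53:54 -> 11:55:51) is about 17.9 KB/s, roughly 143 kbps
    3588759 bytes / ~183 s (13:54:43 -> 13:57:46) is about 19.6 KB/s, roughly 157 kbps
    Both rates are in the ballpark of a typical MP3 bitrate, so the 2 vs 3 minute figures are consistent with the buffer simply emptying at decode speed.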

    Then, we see it go fetch the next track on the playlist, and playback continues:
    [13:57:47.067] decode_thread:100 decode complete
    [13:57:47.067] sendSTAT:166 ms_played: 183306 (frames_played: 8091728 device_frames: 7917)
    [13:57:47.067] sendSTAT:195 STAT: STMd
    [13:57:47.067] sendSTAT:200 received bytesL: 3588759 streambuf: 0 outputbuf: 3390976 calc elapsed: 183306 real elapsed: 183500 (diff: -194) device: 179 delay: 0
    [13:57:47.067] sendSTAT:166 ms_played: 183306 (frames_played: 8091728 device_frames: 7917)
    [13:57:47.067] sendSTAT:195 STAT: STMt
    [13:57:47.067] sendSTAT:200 received bytesL: 3588759 streambuf: 0 outputbuf: 3390976 calc elapsed: 183306 real elapsed: 183500 (diff: -194) device: 179 delay: 0
    [13:57:47.154] process:527 strm
    [13:57:47.154] process_strm:280 strm command s
    [13:57:47.154] process_strm:350 strm s autostart: 1 transition period: 3 transition type: 0 codec: m
    [13:57:47.154] sendSTAT:166 ms_played: 183400 (frames_played: 8091728 device_frames: 7917)
    [13:57:47.154] sendSTAT:195 STAT: STMf
    [13:57:47.154] sendSTAT:200 received bytesL: 3588759 streambuf: 0 outputbuf: 3390976 calc elapsed: 183400 real elapsed: 183594 (diff: -194) device: 179 delay: 94
    [13:57:47.154] codec_open:264 codec open: 'm'
    [13:57:47.154] connect_socket:164 connecting to 192.168.0.60:9002
    [13:57:47.175] stream_sock:597 header: GET /stream.mp3?player=70:85:c2:a8:ce:68 HTTP/1.0
    ...
    [13:57:47.286] decode_thread:75 streambuf bytes: 3223041 outputbuf space: 172487

    This is totally normal, no errors at all.

    Increasing the stream buffer seems to be a solid workaround, but we still don't know why the dropouts were occurring.
    If this issue is due to a resource leak, then increasing the buffer would only delay the outage rather than cure it.
    But so far I don't see any sign of the issue when using a larger stream buffer, even now that 55 minutes have gone by.

  4. #34
    Junior Member
    Join Date
    Jan 2021
    Posts
    15
    Quote Originally Posted by foopydog View Post
    ...
    The Squeezelite client has a socket open, and suddenly, there is no socket. Consider: SqueezeLite on RaspberryPi with TinyLinux has no issues, but only Windows machines have the issue. While the source code for SqueezeLite is (almost) identical between these two OSes, they have very different TCP/IP transport implementations. It is therefore plausible that the root cause is in the way SqueezeLite manages its socket connections on Windows -- or in Winsock itself. (utils.c, stream.c, slimproto.c)
    For example:
    Code:
    #if WIN
    	u_long iMode = 1;
    	ioctlsocket(s, FIONBIO, &iMode);
    #else
    	int flags = fcntl(s, F_GETFL,0);
    	fcntl(s, F_SETFL, flags | O_NONBLOCK);
    #endif
    Here we can see the difference between Windows and Linux implementations.
    ...
    I have looked at a capture taken with Network Monitor. LMS (host) sends a "TCP Keep Alive" packet and Squeezelite (client) seems not to acknowledge it. Squeezelite issues "TCP Dup Ack"s and "TCP Retransmit request"s although it seems no packets were lost or out of order.
    According to https://docs.microsoft.com/en-us/win...k/so-keepalive:

    The getsockopt function called with the SO_KEEPALIVE socket option allows an application to retrieve the current state of the keepalive option, although this feature is not normally used. If an application needs to enable keepalive packets on a socket, it just calls the setsockopt function to enable the option.

    SO_KEEPALIVE is optional under Windows. I can't see that squeezelite has SO_KEEPALIVE enabled on the socket. It seems LMS will just reset the socket after some default timespan; at least I see a "TCP Reset" frame from the host.
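    For reference, enabling it would be a one-liner at socket-setup time. Here is a sketch (mine, not the shipped squeezelite code), mirroring the #if WIN split quoted earlier:
    Code:
    #if WIN
    #include <winsock2.h>
    typedef SOCKET sock_t;
    #else
    #include <sys/socket.h>
    typedef int sock_t;
    #endif

    /* enable TCP keepalive on an already-connected socket;
       the (const char *) cast keeps Winsock happy and POSIX accepts it too */
    static int enable_keepalive(sock_t s) {
    	int optval = 1;
    	return setsockopt(s, SOL_SOCKET, SO_KEEPALIVE,
    	                  (const char *)&optval, sizeof(optval));
    }
    On Windows the keepalive timing can additionally be tuned per socket with WSAIoctl(SIO_KEEPALIVE_VALS), but just turning SO_KEEPALIVE on should be enough for a test.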

    In addition to that, it seems that just when the socket is reset from the host side, squeezelite enters the 100% CPU mode.

  5. #35
    Junior Member
    Join Date
    Jan 2021
    Posts
    15
    If SO_KEEPALIVE is the primary issue, then the assumed race condition may not be a race condition at all.
    Just looking at the code in slimproto.c, the function slimproto_run() may enter a loop with 100% CPU time if the socket is reset but not yet closed and this condition is not checked. The loop then terminates only when the socket is finally closed.
    So it seems there are two issues:

    1. SO_KEEPALIVE is not enabled on the socket.

    2. Given that not all possible stream states can be foreseen, the main loop in slimproto_run() could check whether it is running tight and sleep for a short duration if it detects that. It could also debug-log any unknown condition (see the sketch below).
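    Here is a minimal sketch of what I mean in point 2 (the names are hypothetical stand-ins for the real slimproto.c internals; this is just the idea, not a patch):
    Code:
    #include <stdbool.h>
    #include <unistd.h>   /* usleep(); a Windows build would use Sleep() */

    /* hypothetical stand-ins for the real slimproto.c internals */
    typedef enum { EVENT_TIMEOUT, EVENT_READ, EVENT_WAKE } event_type;
    extern event_type wait_for_event(int timeout_ms);  /* like wait_readwake()  */
    extern bool handle_event(event_type ev);           /* true if progress made */
    extern volatile bool running;

    void proto_loop_sketch(void) {
    	unsigned no_progress = 0;

    	while (running) {
    		event_type ev = wait_for_event(1000);

    		if (ev == EVENT_TIMEOUT || handle_event(ev)) {
    			no_progress = 0;          /* idle timeout or real work done */
    			continue;
    		}

    		if (++no_progress > 100) {        /* looks like a tight loop */
    			/* unknown/unhandled condition: debug-log it here and back off */
    			usleep(10 * 1000);        /* 10 ms */
    			no_progress = 0;
    		}
    	}
    }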

  6. #36
    Senior Member
    Join Date
    Dec 2020
    Posts
    126
    Quote Originally Posted by fpo View Post
    SO_KEEPALIVE is optional under Windows. I can't see that squeezelite has SO_KEEPALIVE enabled on the socket. It seems LMS will just reset the socket after some default timespan, at least I see a "TCP Reset" frame from the host.
    Except that 10053 points to either the client or some network component closing the socket. If LMS closed the connection you would have seen a 10054.

    Should also note that retransmit requests are not done by the receiver (i.e. the peer waiting for some response) but the sender. Since the socket is created as "non blocking", not receiving an ACK from the server likely causes the app to go into a loop until finally giving up (and generating the 10053), which explains the 100% CPU you're seeing.

    Is your switch capable of port mirroring? If so you could bind your Pi to receive the same data as the windows machine and use tcpdump to verify what wireshark is telling you on the Windows machine. This would allow you to see whether it is the outgoing traffic being blocked, or the incoming.

  7. #37
    Junior Member
    Join Date
    Jan 2021
    Posts
    15
    Quote Originally Posted by gordonb3 View Post
    Except that 10053 points to either the client or some network component closing the socket. If LMS closed the connection you would have seen a 10054.

    Should also note that retransmit requests are not done by the receiver (i.e. the peer waiting for some response) but the sender. Since the socket is created as "non blocking", not receiving an ACK from the server likely causes the app to go into a loop until finally giving up (and generating the 10053), which explains the 100% CPU you're seeing.

    Is your switch capable of port mirroring? If so you could bind your Pi to receive the same data as the windows machine and use tcpdump to verify what wireshark is telling you on the Windows machine. This would allow you to see whether it is the outgoing traffic being blocked, or the incoming.
    Strangely, it seems to go both ways: TCP:[Keep alive] is initiated by the host as well as by the client.
    See the following excerpts from a capture with Microsoft Network Monitor (I have replaced the actual IP addresses with 'CLIENT' and 'HOST'):

    Time / SourceToDestination / FrameNumber / Description

    9.5011191 CLIENT to HOST 316 TCP:[Keep alive]Flags=...A...., SrcPort=56579, DstPort=9000, PayloadLen=1, Seq=2774461185 - 2774461186, Ack=3913298766, Win=8208 {TCP:125, IPv4:13}
    9.5013944 HOST to CLIENT 317 TCP:[Keep alive ack]Flags=...A...., SrcPort=9000, DstPort=56579, PayloadLen=0, Seq=3913298766, Ack=2774461186, Win=1332 {TCP:125, IPv4:13}

    and

    32.9651553 HOST to CLIENT 3982 TCP:[Keep alive]Flags=...A...., SrcPort=9000, DstPort=56925, PayloadLen=0, Seq=2162354824, Ack=1382232115, Win=229 (scale factor 0x7) = 29312 {TCP:323, IPv4:13}
    32.9652022 CLIENT to HOST 3983 TCP:[Dup Ack #3977]Flags=...A...., SrcPort=56925, DstPort=9000, PayloadLen=0, Seq=1382232115, Ack=2162354825, Win=0 (scale factor 0x8) = 0 {TCP:323, IPv4:13}
    33.0935782 CLIENT to HOST 3984 TCP:[Keep alive]Flags=...A...., SrcPort=56579, DstPort=9000, PayloadLen=1, Seq=2774462837 - 2774462838, Ack=3913300530, Win=8209 {TCP:125, IPv4:13}
    33.0965046 HOST to CLIENT 3985 TCP:[Keep alive ack]Flags=...A...., SrcPort=9000, DstPort=56579, PayloadLen=0, Seq=3913300530, Ack=2774462838, Win=1332 {TCP:125, IPv4:13}
    33.1615348 CLIENT to HOST 3986 HTTP:Request, POST /jsonrpc.js {HTTP:31, TCP:30, IPv4:13}
    33.1618417 HOST to CLIENT 3987 TCP:Flags=...A...., SrcPort=9000, DstPort=55966, PayloadLen=0, Seq=2820479526, Ack=4154110742, Win=2097 {TCP:30, IPv4:13}
    33.3604901 HOST to CLIENT 3988 TCP:Flags=...AP..., SrcPort=3483, DstPort=56585, PayloadLen=30, Seq=2271133931 - 2271133961, Ack=2859076443, Win=583 {TCP:14, IPv4:13}
    33.3608028 CLIENT to HOST 3989 TCP:Flags=...AP..., SrcPort=56585, DstPort=3483, PayloadLen=61, Seq=2859076443 - 2859076504, Ack=2271133961, Win=8208 {TCP:14, IPv4:13}
    33.3655584 HOST to CLIENT 3990 HTTP:Response, HTTP/1.1, Status: Ok, URL: /jsonrpc.js {HTTP:31, TCP:30, IPv4:13}
    33.3737269 HOST to CLIENT 3991 TCP:Flags=...A...., SrcPort=3483, DstPort=56585, PayloadLen=0, Seq=2271133961, Ack=2859076504, Win=583 {TCP:14, IPv4:13}
    33.4051077 HOST to CLIENT 3992 TCP:[Keep alive][Dup Ack #3982]Flags=...A...., SrcPort=9000, DstPort=56925, PayloadLen=0, Seq=2162354824, Ack=1382232115, Win=229 (scale factor 0x7) = 29312 {TCP:323, IPv4:13}
    33.4051521 CLIENT to HOST 3993 TCP:[Dup Ack #3977]Flags=...A...., SrcPort=56925, DstPort=9000, PayloadLen=0, Seq=1382232115, Ack=2162354825, Win=0 (scale factor 0x8) = 0 {TCP:323, IPv4:13}
    34.2371206 HOST to CLIENT 4001 TCP:[Keep alive][Dup Ack #3982]Flags=...A...., SrcPort=9000, DstPort=56925, PayloadLen=0, Seq=2162354824, Ack=1382232115, Win=229 (scale factor 0x7) = 29312 {TCP:323, IPv4:13}
    34.2371686 CLIENT to HOST 4002 TCP:[Request Fast-Retransmit from Seq2162354825]Flags=...A...., SrcPort=56925, DstPort=9000, PayloadLen=0, Seq=1382232115, Ack=2162354825, Win=0 (scale factor 0x8) = 0 {TCP:323, IPv4:13}

    Whenever TCP:[Keep alive] is initiated by the client (squeezelite), the host's (LMS) response is TCP:[Keep alive ack]. That seems to be OK.
    Whenever TCP:[Keep alive] is initiated by the host, however, the client's response is TCP:[Dup Ack #frame number]. Is that OK? I assume this is where the issue takes off.
    When TCP:[Keep alive][Dup Ack #frame] is sent by the host, the client's response is TCP:[Request Fast-Retransmit from Seq...].

    After some time I get this:

    86.5249184 HOST to CLIENT 5239 TCP:[Keep alive][Request Fast-Retransmit from Seq1382232115]Flags=...A...., SrcPort=9000, DstPort=56925, PayloadLen=0, Seq=2162354824, Ack=1382232115, Win=229 (scale factor 0x7) = 29312 {TCP:323, IPv4:13}
    86.5249691 CLIENT to HOST 5240 TCP:[Request Fast-Retransmit from Seq2162354825]Flags=...A...., SrcPort=56925, DstPort=9000, PayloadLen=0, Seq=1382232115, Ack=2162354825, Win=0 (scale factor 0x8) = 0 {TCP:323, IPv4:13}

    and further down still

    120.2225124 HOST to CLIENT 7665 TCP:Flags=...A...F, SrcPort=9000, DstPort=56925, PayloadLen=0, Seq=2164454887, Ack=1382232115, Win=229 (scale factor 0x7) = 29312 {TCP:323, IPv4:13}

    The latter seems to me to be the point where the host tears down the connection.

    My current assumption is that in this code
    Code:
    while (running && !new_server) {
        ....
        if ((ev = wait_readwake(ehandles, 1000)) != EVENT_TIMEOUT) {
        ....
        }
        ....
    }
    in the function static void slimproto_run(), the ev value is unexpected and this results in a tight loop. Does that make sense?

    The exchange above is apparently not happening at the application level but within the TCP stack; the correct behavior has to be configured at the application level upfront.
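    For what it's worth, here is a plausible mechanism for the tight loop (my sketch, not verified against the actual source): once the host has sent its FIN, the socket stays permanently "readable", so a wait such as wait_readwake() returns immediately on every pass and recv() keeps returning 0. If that 0 is not treated as "connection closed", the loop never blocks again:
    Code:
    #include <poll.h>
    #include <sys/socket.h>

    /* demonstration only (POSIX names): after the peer sends FIN, poll() keeps
       reporting the socket readable and recv() keeps returning 0, so this loop
       spins at full speed unless the n == 0 case is handled */
    static void drain_socket(int s) {
    	char tmp[4096];

    	for (;;) {
    		struct pollfd pfd = { .fd = s, .events = POLLIN };
    		if (poll(&pfd, 1, 1000) <= 0)
    			continue;                      /* timeout or signal */

    		int n = recv(s, tmp, sizeof tmp, 0);
    		if (n == 0) break;                     /* peer closed: without this check we spin */
    		if (n < 0)  break;                     /* reset or other error */
    		/* ... hand the n bytes to the stream buffer ... */
    	}
    }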

  8. #38
    Junior Member
    Join Date
    Jan 2021
    Posts
    15
    Quote Originally Posted by gordonb3 View Post
    Except that 10053 points to either the client or some network component closing the socket. If LMS closed the connection you would have seen a 10054.
    ...
    The doc for WSAECONNABORTED 10053:
    Software caused connection abort. An established connection was aborted by the software in your host computer, possibly due to a data transmission time-out or protocol error.

    mentions protocol error as a possible cause for the abort. Could it be that the client TCP implementation (squeezelite) aborts because of the Keep Alive issue?

    In any case, it would be helpful to test with a squeezelite version that has KeepAlive enabled on the socket.

  9. #39
    Senior Member ralphy
    Join Date
    Jan 2006
    Location
    Canada
    Posts
    2,841
    Quote Originally Posted by fpo View Post
    The doc for WSAECONNABORTED 10053:
    Software caused connection abort. An established connection was aborted by the software in your host computer, possibly due to a data transmission time-out or protocol error.

    mentions protocol error as a possible cause for the abort. Could it be that the client TCP implementation (squeezelite) aborts because of the Keep Alive issue?

    In any case, it would be helpful to test with a squeezelite version that has KeepAlive enabled on the socket.
    Thanks for continuing to help track down this problem.

    I've built a win64 r1324-test1 build with keepalive enabled on the slimprotocol socket to try.
    Ralphy

    1-Touch, 5-Classics, 3-Booms, 2-UE Radio
    Squeezebox client builds donations always appreciated.

  10. #40
    Junior Member
    Join Date
    Jan 2021
    Posts
    15
    Quote Originally Posted by ralphy View Post
    Thanks for continuing to help track down this problem.

    I've built a win64 r1324-test1 build with keepalive enabled on the slimprotocol socket to try.
    Unfortunately no change; there are still dropouts. The KeepAlive behavior seems unchanged. You say you have enabled KeepAlive on the slimprotocol socket. Is this the one on port 9000 or 3483? It is needed on port 9000.

    Here is the traffic on port 9000 prior to the 100% CPU state:

    056.2501063 HOST to CLIENT 2759 TCP:[Continuation to #2754]Flags=...A...., SrcPort=9000, DstPort=62970, PayloadLen=1460, Seq=1349354153 - 1349355613, Ack=3133506948, Win=229 {TCP:45, IPv4:1}
    056.2501063 HOST to CLIENT 2760 TCP:[Continuation to #2754]Flags=...AP..., SrcPort=9000, DstPort=62970, PayloadLen=730, Seq=1349355613 - 1349356343, Ack=3133506948, Win=229 {TCP:45, IPv4:1}
    056.2501196 CLIENT to HOST 2761 TCP:Flags=...A...., SrcPort=62970, DstPort=9000, PayloadLen=0, Seq=3133506948, Ack=1349356343, Win=1289 {TCP:45, IPv4:1}
    056.6534090 HOST to CLIENT 2762 TCP:Flags=...A...F, SrcPort=9000, DstPort=62970, PayloadLen=0, Seq=1349356343, Ack=3133506948, Win=229 {TCP:45, IPv4:1}
    056.6534705 CLIENT to HOST 2763 TCP:Flags=...A...., SrcPort=62970, DstPort=9000, PayloadLen=0, Seq=3133506948, Ack=1349356344, Win=1289 {TCP:45, IPv4:1}
    140.4406135 CLIENT to HOST 4733 TCP:[Dup Ack #2763]Flags=...A...., SrcPort=62970, DstPort=9000, PayloadLen=0, Seq=3133506948, Ack=1349356344, Win=1329 {TCP:45, IPv4:1}
    140.4409142 HOST to CLIENT 4734 TCP:Flags=.....R.., SrcPort=9000, DstPort=62970, PayloadLen=0, Seq=1349356344, Ack=0, Win=0 {TCP:45, IPv4:1}

    Host sends data: PayloadLen 1460 then 730 (the stream buffer is seemingly full at this time)
    Client acknowledges the data up to Ack=1349356343
    Host acks the ack and sends FIN with Seq=1349356343
    Client acks with Ack=1349356344 (note: +1)
    There is no traffic on port 9000 between time 56.653 and 140.44. There is traffic on port 3483, however (not shown).
    Client dup-acks with Ack=1349356344 again
    Host resets with Seq=1349356344

    After that, the 100% CPU state sets in.
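    That again fits the picture: after the host's FIN at 56.65 the connection is half-closed, and after the RST at 140.44 any further read should fail immediately. Here is a sketch of the kind of return-value handling that would let the client notice this (my illustration, not the current squeezelite code):
    Code:
    #if WIN
    #include <winsock2.h>
    #define SOCK_ERRNO()   WSAGetLastError()
    #define ERR_RESET      WSAECONNRESET
    #define ERR_WOULDBLOCK WSAEWOULDBLOCK
    typedef SOCKET sock_t;
    #else
    #include <errno.h>
    #include <sys/socket.h>
    #define SOCK_ERRNO()   errno
    #define ERR_RESET      ECONNRESET
    #define ERR_WOULDBLOCK EWOULDBLOCK
    typedef int sock_t;
    #endif

    /* returns >0 bytes read, 0 to retry later, or -1 when the connection is gone
       (peer FIN or reset) and the caller should close the socket and reconnect */
    static int read_stream(sock_t s, char *buf, int len) {
    	int n = recv(s, buf, len, 0);
    	if (n > 0) return n;                           /* normal data           */
    	if (n == 0) return -1;                         /* orderly close (FIN)   */
    	if (SOCK_ERRNO() == ERR_WOULDBLOCK) return 0;  /* non-blocking: no data */
    	if (SOCK_ERRNO() == ERR_RESET)      return -1; /* connection reset      */
    	return -1;                                     /* any other error       */
    }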
