Tell the format of bytes sent via tcp?

Question

0 votes

I'm sending a number of integers wirelessly to MATLAB and I want to know when the transfer has finished. So far I have been using while (get(t, 'BytesAvailable') > 0), where t is my TCP/IP object, so the while loop exits when there are no bytes available. That is fine once the bytes have stopped arriving, but my bytes arrive incrementally, i.e. they start and stop, so the loop exits before I have received all of my bytes. Note: this has nothing to do with InputBufferSize; the stop-start arrival is simply how the source I'm sending from behaves.

When you read bytes you specify the format, e.g. fscanf(t, '%d', 1) for ints, so I was hoping to add some kind of delimiter that would tell me when the integers have finished, e.g. if the byte is not an integer and something like a character then you could tell when to finish reading. However, how do you read bytes whose format you don't know? Previously I used delimiters to know when to finish reading an integer: on Arduino, with Serial.parseInt(), when you send an int followed by a character, parseInt returns as soon as it hits the character.

Alternatively, another way of looking at it: say you had a buffer of bytes where some of the bytes were ints and some were characters. How do you know when to read with a different format type?

I looked at using BytesAvailableFcn to trigger a callback when the bytes have finished, but this wasn't really appropriate for what I want to do.

4 Comments

Guillaume on 14 May 2015

Edited: Guillaume on 14 May 2015


I think you are confusing bytes and characters. Semantically they are two very different things.

When you use

fscanf(t, '%d')

you are not reading bytes as integers, but reading characters as integers. This is important as there are two ways you could encode integers to send over a serial port:

Use text. For example to send the integer 1234, you'd encode it as text: '1234' and you'd send the bytes [49 50 51 52] (the ascii values of the characters). The translation from number to character and then character to byte is what fprintf does for you.
Use binary. For example to send the integer 1234 using a 16-bit representation, you'd send the bytes [4 210] (using big-endian encoding) because 1234 = 4 * 256 + 210; a 32-bit representation would send [0 0 4 210]. That translation is what fwrite does for you.

As you can see the actual bytes that are sent over the wire are very different.
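A minimal MATLAB sketch of the two encodings, using only array operations so it runs without any connection. The variable names are illustrative, and the binary bytes are computed by hand here rather than by calling fwrite:

```matlab
n = 1234;

% Text encoding: one byte per decimal digit (the ASCII codes of '1','2','3','4')
textBytes = uint8(sprintf('%d', n));              % [49 50 51 52]

% Binary encoding: 16-bit unsigned, big-endian (high byte first)
binBytes = uint8([floor(n / 256), mod(n, 256)]);  % [4 210]

% Decoding each form back to the same number
fromText = str2double(char(textBytes));                         % 1234
fromBin  = double(binBytes(1)) * 256 + double(binBytes(2));     % 1234
```

The receiver has to know which of the two forms the sender used, because nothing in the bytes themselves says so.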

In that context, "if the byte is not an integer and something like a character" is meaningless. A byte is always an integer (in the range 0-255). And any byte in the range 0-127 is also always the representation of an ASCII character.

In your case, it sounds like you're using text exclusively to transmit data over the serial port. Hence you're always reading characters. So what you meant is "if the character is not a digit (0-9) but some other character". (You're also mixing up digits, the characters '0' to '9', and integers.)

Guillaume on 14 May 2015

Edited: Guillaume on 14 May 2015

@Walter, Yes what's transmitted are bytes. I made some simplifications here to avoid confusing the issue.

From my reading of the OP, what is being transmitted is exclusively text (encoded as bytes of course) and when Robert talks of sending integer, what he means is that he sends the text representation of the digits of the integer (encoded as decimal). So most of the time, when he talks about bytes he's actually talking about characters.

edit: Actually, rereading the OP, I'm very confused: are we talking about a TCP connection or a serial connection? As far as I know fscanf is not a member of tcpclient, and Serial.ParseInt does sound like a serial connection operation. Furthermore, most of the OP makes a lot more sense if what he's writing and reading are characters. Yet the title and the start of the post talk about TCP.

Ronan on 18 May 2015

I was referring to a TCP connection. You can use fscanf for TCP objects. When I mentioned Serial.ParseInt I was just giving an example of how I used delimiters before with Arduino serial data. Using delimiters was something I was considering for reading bytes from a TCP object in MATLAB.


Answer 1

Guillaume on 14 May 2015

Edited: Guillaume on 14 May 2015


0 votes

As per my comment to your question, I think you're confusing bytes and characters and you're dealing exclusively with transmitting characters (which ultimately are transferred as bytes, but that translation is hidden from you by fprintf).

To solve your problem, what you want to do is read the stream one character at a time and check whether or not that character is a digit:

digits = '';
while strcmp(t.Status, 'open')
   c = fscanf(t, '%c', 1); %read one character only
   if isempty(c), continue; end %nothing arrived before the timeout, keep waiting
   if c >= '0' && c <= '9'
      %character is a digit
      digits = [digits c]; %add to read digits
   elseif ~isempty(digits)
      %character is not a digit, convert the digits that have been read to integer
      number = str2double(digits); %convert string to number
      digits = '';  %reset string to nothing
      %do something with number
   end
end
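To try the same digit test without a device attached, a sketch like the following runs it over a char vector standing in for the stream. The sample string and the variable names are made up for illustration; any non-digit character acts as the delimiter:

```matlab
stream = '12,345;6x';   % hypothetical received text: digits separated by other characters
digits = '';
numbers = [];

for c = stream                      % iterate one character at a time
    if c >= '0' && c <= '9'
        digits = [digits c];        % still inside a number
    elseif ~isempty(digits)
        numbers(end+1) = str2double(digits);   % delimiter reached: convert and store
        digits = '';
    end
end

% Flush a trailing number that was not followed by a delimiter
if ~isempty(digits)
    numbers(end+1) = str2double(digits);
end
% numbers is now [12 345 6]
```

Requiring a non-empty digits string before converting means consecutive delimiters do not produce NaN entries.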

1 Comment

Ronan on 18 May 2015

Thank you for the response. I will give this a go, because I previously had success reading everything as a character when I was practising with a text file populated with integers and characters.


Answer 2

Walter Roberson on 14 May 2015

0 votes

Other than packet headers (and trailers), TCP (and UDP) just send bytes. The bytes are not marked as to how they are intended to be interpreted. Interpretation is up to the application.

It is very common for applications to pre-define the number and interpretation of bytes according to their relative position. For example it might define that the first 6 bytes in the stream are a constant string that serves to identify the program, and it might define the next 2 bytes as being an unsigned 16 bit integer that represents a version number, and it might define everything up to the next 0x0d 0x0a (CR-LF) as being a printable "banner" that might be displayed to the user. It might define that the byte after that is to be binary 0 if needed to reach an even byte boundary. And so on: it can be a mix of fixed-length and variable-length information (including information whose end is marked by a pre-defined terminator.)
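A positional layout like the one just described could be parsed roughly as follows. This is only a sketch under assumed values: the identifier 'MYPROT', the version 3, the banner text and the trailing bytes are all invented for the example, and the buffer is built in memory instead of being read from a connection:

```matlab
% Hypothetical stream: 6-byte identifier, 2-byte big-endian version,
% banner terminated by CR-LF, then further data
buffer = [uint8('MYPROT'), uint8([0 3]), uint8('Welcome'), uint8([13 10]), uint8([7 7])];

magic   = char(buffer(1:6));                              % fixed-length string
version = double(buffer(7)) * 256 + double(buffer(8));    % unsigned 16-bit, big-endian

rest = buffer(9:end);
crlf = strfind(char(rest), char([13 10]));                % positions of the CR-LF terminator
banner    = char(rest(1:crlf(1) - 1));                    % variable-length text before it
remaining = rest(crlf(1) + 2:end);                        % whatever follows the banner
```

The fixed-length fields are located purely by position, while the banner is located by searching for its terminator.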

If there is variable-length binary information to be transferred, then it is almost never done by using a termination marker. Instead, the information is almost always preceded by a binary count of the number of bytes that the binary information will occupy. In cases where mixed binary and string information is being sent, it is not uncommon for strings to be preceded by a binary count of the bytes occupied. Having a count makes processing faster, as code that is not interested in the string can skip forward that many bytes in the stream instead of having to examine each byte to see if the terminator has been reached.
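As an illustration of a count-prefixed block, here is a small sketch that frames a payload with a 2-byte big-endian length and then reads it back. The 2-byte width of the count and the sample payload are assumptions made for the example, not part of any particular protocol:

```matlab
payload = uint8('hello');                     % hypothetical variable-length data
len = numel(payload);

% Sender side: 2-byte big-endian count followed by the payload itself
frame = [uint8(floor(len / 256)), uint8(mod(len, 256)), payload];

% Receiver side: the buffer may hold more data after this frame
buffer = [frame, uint8([1 2 3])];
count  = double(buffer(1)) * 256 + double(buffer(2));   % how many bytes belong to this block
body   = buffer(3:2 + count);                            % exactly count bytes
rest   = buffer(3 + count:end);                          % start of the next item in the stream
```

A reader that does not care about the block only needs count to jump straight to rest, without inspecting the payload bytes.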

Some protocols are primarily text based. An example is the SMTP email protocol, where text commands are sent and text responses are received. Those protocols seldom require that every string block be preceded by a byte count. Nonetheless, one of the text commands might signify in the protocol that it is to switch temporarily into a binary protocol, such as to transfer a block of binary data efficiently.

There is a standard for sending blocks of data that might be of varying data-type or which might be in different byte orders: the standard is known as XDR. It has its uses, but more often people define TCP protocols as requiring that data be transmitted in Network Byte Order (which is Big Endian).

6 Comments

Guillaume on 18 May 2015

Ronan, go read my first comment on your question at the top of this page.

You're talking about reading integers represented as a string of decimal digits. Indeed the number of characters required depends on the magnitude of the number (although if an upper bound is known, you could decide on a fixed length string and pad the shorter numbers with '0').

Walter is talking about reading integers as bytes, the same way they're encoded in memory. You just have to agree on the number of bits / bytes used to encode your integer and read that number of bytes. For example, 16 bits (2 bytes) can encode all unsigned integers from 0 to 2^16-1 = 65535.

The latter is a very standard way of transmitting data over networks. You just agree beforehand on the number of bytes used to encode each number. See for example the description of the TCP header format. The first number in the header is a 16-bit unsigned integer. Hence you're always reading two bytes for the first number.
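A short sketch of that fixed-width reading, applied to four sample bytes laid out like the first two fields of a TCP header (two 16-bit unsigned integers in network byte order). The byte values are invented for the example:

```matlab
bytes = uint8([0 80 195 80]);             % hypothetical first four header bytes

% Group the bytes in pairs and combine each pair as high*256 + low
pairs  = reshape(double(bytes), 2, []);   % one column per 16-bit value
values = [256 1] * pairs;                 % [80 50000]

sourcePort = values(1);
destPort   = values(2);
```

Because both sides agreed on two bytes per number, no delimiter is needed to tell where one value ends and the next begins.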

Ronan on 18 May 2015

Sorry, I was doing something stupid. All of my focus was going into what MATLAB was doing. You were right, I was actually sending the integers as a string of characters. I'm using an Arduino board to transmit bytes over wifi and I never considered the format of Serial.print() on Arduino. I was aware Serial.write() only sends one byte, and I took it for granted that Serial.print() automatically sends the variables in their original format. I was aware that integers are normally around 2 bytes, but when I was getting 5 bytes for 3-digit integers in MATLAB I assumed that maybe it was because of extra data like a new line or something weird in MATLAB. So it makes sense that a string of 3 digits makes up 3 characters and characters equate to about 2 bytes also making up around 5 bytes. So, that said, reading one character at a time should work out.
