How to tell the format of bytes sent via TCP?

Ronan
Ronan on 13 May 2015
Commented: Ronan on 18 May 2015
I'm sending a number of integers wirelessly to MATLAB and I want to know when the transfer has finished. So far I have been using the loop while (get(t, 'BytesAvailable') > 0), where t is my TCP/IP object, so the loop exits when no bytes are available. That is fine once the bytes have stopped arriving, but my bytes are sent incrementally, i.e. they start and stop, which means the while loop exits before I have received all of my bytes. Note: this has nothing to do with InputBufferSize; the stop-start pattern is simply the behaviour of the source I'm sending from.
When you read bytes you specify the format, e.g. fscanf(t, '%d', 1) for ints, so I was hoping to insert some kind of delimiter that would tell me when the integers have ended. For example, if the next byte is not an integer but something like a character, you could tell when to finish reading. But how do you read bytes whose format you don't know? Previously I used delimiters to know when to finish reading an integer: on Arduino I had something like Serial.parseInt(), and when you send an int followed by a character, parseInt returns as soon as it hits the character.
Another way of looking at it: say you had a buffer of bytes where some of the bytes were ints and some were characters. How do you know when to switch to reading with a different format type? I looked at using BytesAvailableFcn to trigger a callback when bytes have finished arriving, but this wasn't really appropriate for what I want to do.
  4 Comments
Guillaume
Guillaume on 14 May 2015
Edited: Guillaume on 14 May 2015
@Walter, Yes what's transmitted are bytes. I made some simplifications here to avoid confusing the issue.
From my reading of the OP, what is being transmitted is exclusively text (encoded as bytes of course), and when Ronan talks of sending integers, what he means is that he sends the text representation of the digits of the integer (encoded as decimal). So most of the time, when he talks about bytes he's actually talking about characters.
edit: Actually, rereading the OP, I'm very confused: are we talking about a TCP connection or a serial connection? As far as I know fscanf is not a member of tcpclient, and Serial.parseInt does sound like a serial connection operation. Furthermore, most of the OP makes a lot more sense if what he's writing/reading are characters. Yet the title and the start of the post talk about TCP.
Ronan
Ronan on 18 May 2015
I was referring to a TCP connection. You can use fscanf for TCP objects. When I mentioned Serial.parseInt I was just giving an example of how I used delimiters before with Arduino serial data. Using delimiters was something I was considering trying in order to read bytes from a TCP object in MATLAB.


Accepted Answer

Guillaume
Guillaume on 14 May 2015
Edited: Guillaume on 14 May 2015
As per my comment to your question, I think you're confusing bytes and characters and you're dealing exclusively with transmitting characters (which ultimately are transferred as bytes, but that translation is hidden from you by fprintf).
To solve your problem, what you want to do is read the stream one character at a time and check whether or not that character is a digit:
digits = '';
while strcmp(t.Status, 'open')
    c = fscanf(t, '%c', 1); % read one character only
    if isempty(c)
        continue % nothing was read (e.g. timeout); try again
    end
    if c >= '0' && c <= '9'
        % character is a digit
        digits = [digits c]; % add to read digits
    elseif ~isempty(digits)
        % character is not a digit: convert the digits read so far to an integer
        number = str2double(digits); % convert string to number
        digits = ''; % reset string to nothing
        % do something with number
    end
end
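The same digit-accumulation logic can be tried offline on a plain character array before connecting it to the TCP object. This is only a minimal sketch: the sample string and the variable names stream and numbers are made up for illustration and stand in for the characters that would come from fscanf.

```matlab
% Offline sketch: split a character stream into integers at any non-digit.
stream  = '12,345;6 789x';   % hypothetical sample input (assumption)
digits  = '';                % digits collected for the current number
numbers = [];                % all numbers parsed so far

for c = stream               % iterate over the characters one at a time
    if c >= '0' && c <= '9'
        digits = [digits c];                  % still inside a number
    elseif ~isempty(digits)
        numbers(end+1) = str2double(digits);  % a delimiter ends the number
        digits = '';
    end
end

disp(numbers)
```

Note that a number is only emitted once a non-digit follows it, so the sender needs to terminate the last integer with a delimiter as well.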
  1 Comment
Ronan
Ronan on 18 May 2015
Thank you for the response. I will give this a go, because I previously had success reading everything as characters when I was practising with a text file populated with integers and characters.


More Answers (1)

Walter Roberson
Walter Roberson on 14 May 2015
Other than packet headers (and trailers), TCP (and UDP) just send bytes. The bytes are not marked as to how they are intended to be interpreted. Interpretation is up to the application.
It is very common for applications to pre-define the interpretation of bytes according to their relative position. For example, an application might define that the first 6 bytes in the stream are a constant string that serves to identify the program, that the next 2 bytes are an unsigned 16-bit integer representing a version number, and that everything up to the next 0x0d 0x0a (CR-LF) is a printable "banner" that might be displayed to the user. It might define that the byte after that is to be binary 0 if needed to reach an even byte boundary. And so on: it can be a mix of fixed-length and variable-length information (including information whose end is marked by a pre-defined terminator).
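A layout like that can be decoded purely by position. The following is a minimal sketch on an in-memory byte array rather than a live connection; the identifier 'MYPROT', the version bytes and the banner text are all invented for illustration, and the version is assumed to be sent high byte first.

```matlab
% Hypothetical message: 6-byte identifier, 2-byte version (high byte first),
% then a banner terminated by CR-LF. All values here are assumptions.
msg = uint8(['MYPROT', 1, 2, 'Hello', 13, 10]);

magic   = char(msg(1:6));                          % fixed-length identifier
version = double(msg(7)) * 256 + double(msg(8));   % unsigned 16-bit integer

rest   = msg(9:end);                               % variable-length part
term   = strfind(char(rest), char([13 10]));       % position of CR-LF
banner = char(rest(1:term(1) - 1));                % text before the terminator

fprintf('%s v%d: %s\n', magic, version, banner);
```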
If there is variable-length binary information to be transferred, then it is almost never done by using a termination marker. Instead, the information is almost always preceded by a binary count of the number of bytes that the binary information will occupy. In cases where mixed binary and string information is being sent, it is not uncommon for strings to be preceded by a binary count of the bytes occupied. Having a count makes processing faster, as code that is not interested in the string can skip forward that many bytes in the stream instead of having to examine each byte to see if the terminator has been reached.
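As a rough illustration of count-prefixed data, the sketch below walks a byte buffer in which every record is a one-byte count followed by that many payload bytes. The one-byte count and the sample payloads are assumptions chosen to keep the example short; real protocols often use wider counts.

```matlab
% Hypothetical buffer: [count, payload...] repeated. Contents are made up.
buf = uint8([3, 'abc', 2, 'hi', 4, 'data']);

pos     = 1;
records = {};
while pos <= numel(buf)
    n = double(buf(pos));                         % length prefix
    records{end+1} = char(buf(pos+1 : pos+n));    % payload of n bytes
    pos = pos + 1 + n;                            % jump straight to next record
end

disp(records)
```

The line that advances pos is the point of the technique: a reader that does not care about a record can skip it without inspecting its bytes.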
Some protocols are primarily text based. An example is the SMTP email protocol, where text commands are sent and text responses are received. Those protocols seldom require that every string block be preceded by a byte count. Nonetheless, one of the text commands might signify in the protocol that it is to switch temporarily into a binary protocol, such as to transfer a block of binary data efficiently.
There is a standard for sending blocks of data that might be of varying data type or in different byte orders: the standard is known as XDR. It has its uses, but more often people define TCP protocols as requiring that data be transmitted in Network Byte Order (which is Big Endian).
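To make the byte-order point concrete, here is a small sketch of packing a 16-bit value into big-endian bytes and unpacking it again with typecast. The value 258 is arbitrary, and the third output of computer is used to detect the host's byte order so that the sketch behaves the same on either kind of machine.

```matlab
value = uint16(258);                 % 0x0102, arbitrary example value
[~, ~, endian] = computer;           % 'L' on little-endian hosts, 'B' otherwise

% Sender side: bytes in host order, reordered to big-endian if needed.
raw = typecast(value, 'uint8');
if endian == 'L'
    raw = fliplr(raw);
end
% raw is now [1 2]: most significant byte first (network byte order)

% Receiver side: undo the reordering before reinterpreting the bytes.
rx = raw;
if endian == 'L'
    rx = fliplr(rx);
end
decoded = typecast(rx, 'uint16');

fprintf('bytes on the wire: %d %d, decoded: %d\n', raw(1), raw(2), decoded);
```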
  6 Comments
Guillaume
Guillaume on 18 May 2015
Ronan, go read my first comment on your question at the top of this page.
You're talking about reading integers represented as a string of decimal digits. Indeed the number of characters required depends on the magnitude of the number (although if an upper bound is known, you could decide on a fixed length string and pad the shorter numbers with '0').
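A minimal sketch of that fixed-length idea, assuming an agreed upper bound of 65535 so that every number occupies exactly 5 characters (the bound and the sample values are only for illustration):

```matlab
values = [7 123 65535];                   % hypothetical numbers to send
txt    = sprintf('%05d', values);         % zero-padded, 5 characters each

% Receiver: cut the text into 5-character fields, one row per number.
fields  = reshape(txt, 5, []).';
decoded = str2double(cellstr(fields)).';

disp(txt)
disp(decoded)
```

With a fixed width no delimiter is needed, because the receiver always knows where the next number starts.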
Walter is talking about reading integers as bytes, the same way they're encoded in memory. You just have to agree on the number of bits / bytes used to encode your integer and read that number of bytes. For example, 16 bits (2 bytes) can encode all unsigned integers from 0 to 2^16-1 = 65535.
The latter is a very standard way of transmitting data over networks. You just agree beforehand on the number of bytes used to encode each number. See for example the description of the TCP header format. The first number in the header is a 16-bit unsigned integer. Hence you're always reading two bytes for the first number.
Ronan
Ronan on 18 May 2015
Sorry, I was doing something stupid. All of my focus was going into what MATLAB was doing. You were right, I was actually sending the integers as a string of characters. I'm using an Arduino board to transmit bytes over WiFi and I never considered the format of Serial.print() on Arduino. I was aware that Serial.write() only sends one byte, and I took it for granted that Serial.print() automatically sends variables in their original format. I was aware that integers are normally around 2 bytes, but when I was getting 5 bytes for 3-digit integers in MATLAB I assumed that maybe it was because of extra data like a newline or something weird in MATLAB. So it makes sense that a string of 3 digits makes up 3 characters, and characters equate to about 2 bytes, also making up around 5 bytes. Having said that, reading one character at a time should work out.

