How can I complement a DNA matrix using a binary vector?

I have a DNA matrix whose size is, for example, m*4n.
For example:
B = 'GATT' 'AACT' 'ACAC' 'TTGA' 'GGCT'
'GCAC' 'TCAT' 'GTTC' 'GCCT' 'TTTA'
'AACG' 'GTTA' 'ACGT' 'CGTC' 'TGGA'
'CTAC' 'AAAA' 'GGGC' 'CCCT' 'TCGT'
'GTGT' 'GCGG' 'GTTT' 'TTGC' 'ATTA'
I also have a vector of real numbers X = {xi, i = 1..m*4n}.
I take mod(X,1) to bring the real numbers into the range [0,1].
The output will look like X = [0.223 0.33 0.71 0.44 0.91 0.32 0.11 ....... m*4n];
I then need to transform the result into a binary vector by applying the threshold function
f(x) = {0, 0 < X(i,j) ≤ 0.5; 1, 0.5 < X(i,j) ≤ 1}
For the values above, the output will look like X = [0010100 ....]
If X(i,j)=1, then A(i,j) is complemented; otherwise it is unchanged.
I tried to code this part as follows, but it didn't work:
%% mapping the chaotic sequence X from real numbers to a binary sequence using a threshold function
X = v(:,3);
X(257)=[];
disp (X);
mode (X,1);
for i=1
for j=1:256
if ((X(i,j)> 0) && (X(i,j)<= .5))
X(i,j) = 0;
elseif ((X(i,j)> .5) && (X(i,j)<= 1))
X(i,j) = 1;
end
end
end
disp(X);
And supposing I can get the binary vector, how do I complement the DNA matrix A using the sequence X?
%% P.S. the complement pairs are A - T, T - A, C - G, G - C
To be more specific, I need the following:
1- Apply mod(X,1) to the vector to get the values in the range [0,1].
2- Map the real-number vector to a binary vector by applying the function f(x) = {0, 0 < X(i,j) ≤ 0.5; 1, 0.5 < X(i,j) ≤ 1}.
3- Use this binary vector X(i,j) to complement the DNA matrix A(i,j): if X(i,j)=1 then A(i,j) is complemented; otherwise it is unchanged.

Accepted Answer

John BG on 22 Dec 2016
Does the following help?
in this example m=5
A0='ACGT'
B='0000'
m=5
for k=1:1:m
B=[B;A0(randi(4,1,4))]
end
B(1,:)=[]
X=rand(1,5)
fX=round(X)
for k=1:1:m
    if fX(k)==1
        L=B(k,:)
        L=strrep(L,'A','0'); L=strrep(L,'T','A'); L=strrep(L,'0','T'); % A - T swap
        L=strrep(L,'C','0'); L=strrep(L,'G','C'); L=strrep(L,'0','G'); % C - G swap
        B(k,:)=L % write back only the rows that were complemented
    end
end
  2 Comments
John BG on 22 Dec 2016
Sure
Please comment and correct me, because I don't have a clue about genetics:
1. The key string contains one of each genetic symbol; it is not a gene coding, just a way to put together the alphabet of 4 characters:
A0='ACGT'
2. initialise result variable
B='0000'
3. m is the amount of genes, in your example you have shown 25, correct me if wrong, humans have 23.
m=5
4. Generate 5 random genes. This is only test data for the next steps, which are the actual answer to your question.
Replace this random generation with whatever sequence you want to process.
For instance you can have the input sequence in a text file. Do you know how to use the command textscan to load text files into MATLAB variables? I can show you how if you don't
for k=1:1:m
B=[B;A0(randi(4,1,4))]
end
B(1,:)=[]
5. generating another random X sequence, for test purposes, only, replace X with your X sequence
X=rand(1,5)
6. The MATLAB command round() does precisely the 'polarising' you requested:
if X(i)<0.5 then X(i)=0, else if X(i)>=0.5 then X(i)=1
This can be modified if you want, for instance to
if X(i)<=0.5 then X(i)=0, else if X(i)>0.5 then X(i)=1
fX=round(X)
7. complementing the rows of B according to X
for k=1:1:m
    if fX(k)==1
        L=B(k,:)
        L=strrep(L,'A','0'); L=strrep(L,'T','A'); L=strrep(L,'0','T'); % A - T swap
        L=strrep(L,'C','0'); L=strrep(L,'G','C'); L=strrep(L,'0','G'); % C - G swap
        B(k,:)=L % write back only the rows that were complemented
    end
end
would it be possible for you to click on the ACCEPT ANSWER so I can get the points?
If there is any further steps you would like to develop before accepting my answer please ask and I will do my best.
Appreciating time and attention, awaiting answer
John BG
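A note on step 6, offered as a minimal sketch: round() sends a value of exactly 0.5 to 1, whereas the threshold function in the question sends 0.5 to 0. If that boundary matters, a direct comparison reproduces the question's rule. The values of X below are made up for illustration.

```matlab
% Illustrative values only; replace X with your own sequence.
X  = [0.223 0.5 0.71 1.44 0.91];
Xm = mod(X, 1);                % fractional part of each value

fRound  = round(Xm);           % exactly 0.5 is rounded up to 1
fThresh = double(Xm > 0.5);    % exactly 0.5 maps to 0, as in the question's f(x)

disp([fRound; fThresh]);       % the two rows differ only where Xm == 0.5
```

The two results differ only where the value is exactly 0.5, so for most real-valued sequences either form will give the same binary vector.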


More Answers (2)

James Tursa on 21 Dec 2016
Edited: James Tursa on 21 Dec 2016
Not quite sure I fully understand, but maybe something like this?
mask = mod(X,1) > 0.5; % logical indexes of the characters to flip
Bmask = B(mask); % get the characters to flip
Bmask(Bmask=='T') = 'a'; % flip T to a
Bmask(Bmask=='A') = 't'; % flip A to t
Bmask(Bmask=='C') = 'g'; % flip C to g
Bmask(Bmask=='G') = 'c'; % flip G to c
B(mask) = upper(Bmask); % replace the original characters with their flipped versions
or using ismember:
mask = mod(X,1) > 0.5; % logical indexes of the characters to flip
Bmask = B(mask); % get the characters to flip
[~,loc] = ismember(Bmask,'ATCG'); % identify the characters to flip
S = 'TAGC'; % the flipped reference string
B(mask) = S(loc); % replace the masked characters with their flipped versions
  1 Comment
M.A.Fathy on 23 Dec 2016
Thanks for your reply. I am working on a matrix, and I want to apply this complement rule to each value in it. How can I apply this to the whole matrix using the X vector? Should I map this vector into a matrix too, or what?
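One possible way to handle the matrix case, sketched under the assumption that B is an m-by-4n char matrix and X holds one value per character, read row by row: reshape the thresholded vector to the size of B and use it as a logical mask. The small B and X below are made up for illustration.

```matlab
% Illustrative data: a 2-by-4 char matrix and one X value per character.
B = ['GATT'; 'AACT'];
X = [0.223 0.33 0.71 0.44 0.91 0.32 0.11 0.62];

% Threshold, then lay the vector out row by row in the shape of B.
% reshape fills column-wise, so build the transpose and flip it back.
mask = reshape(mod(X, 1) > 0.5, size(B, 2), size(B, 1)).';

% Complement only the masked characters (A<->T, C<->G).
[~, loc] = ismember(B(mask), 'ATCG');
S = 'TAGC';
B(mask) = S(loc);

disp(B);
```

If X is instead meant to be read column by column, reshape(mod(X, 1) > 0.5, size(B)) gives the mask directly, with no transpose.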



David Barry on 21 Dec 2016
X = [0.223 0.33 0.71 0.44 0.91 0.32 0.11];
X(X<= 0.5 & X >0) = 0;
X(X>0.5 & X<=1) = 1;
  1 Comment
M.A.Fathy on 22 Dec 2016
Edited: M.A.Fathy on 23 Dec 2016
Thanks for your reply, but when I use & instead of &&, it gives me an error.
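A possible explanation, offered as a sketch rather than a diagnosis of the exact error: && and || are short-circuit operators that expect scalar logical operands, so they suit a scalar test inside a loop, while & works element-wise and is what logical indexing on a whole vector needs. The values below are the sample vector from the answer; Y and Z are names introduced here for illustration.

```matlab
X = [0.223 0.33 0.71 0.44 0.91 0.32 0.11];

% Element-wise form: & compares whole arrays and yields a logical mask.
lowMask = (X > 0) & (X <= 0.5);
Y = X;
Y(lowMask)  = 0;
Y(~lowMask) = 1;   % assumes every X(k) lies in (0, 1]

% Scalar form: && is fine here because X(k) is a single number.
Z = zeros(size(X));
for k = 1:numel(X)
    if (X(k) > 0.5) && (X(k) <= 1)
        Z(k) = 1;
    end
end

disp(isequal(Y, Z));   % both forms give the same binary vector
```

Both forms give the same result on this sample; the vectorized one avoids the loop and the (i,j) indexing.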

