Unique name detection in table headers

2 views (last 30 days)
H.P.
H.P. on 4 May 2023
Answered: Stephen23 on 5 May 2023
Hi,
I have a bunch of data that I need to process. My data consists of plot information with multiple traces within one plot. The generated data is written into multiple tables within one .txt file. Each table represents the information of a single trace. As each trace within a file belongs to the same plot, all tables have the same amount of data points, resulting in an equal number of rows and sorting (This makes my life a lot easier) A picture is attached to this post. However, the actual information of a single trace is saved in the last column, the other colums show parameters that the data point is corresponding to. This results in a lot of abundant stuff within a single file that does not provide any additional information.
As all tables have the same formatting and sorting, I want to extract the last column of each table within the file and generate one single table from it. I was thinking of a solution in which a new column is created whenever a new header name is detected, which would allow for a variable amount of traces to be detected. I do not have the programming knowledge to pull this off however, any tip how to tackle this problem is greatly appreciated.
  2 Comments
Stephen23
Stephen23 on 4 May 2023
Please upload a representative data file by clicking the paperclip button.
This may be your actual data, or if that cannot be distributed, random data with exactly the same file format.

Sign in to comment.

Answers (3)

Vilém Frynta
Vilém Frynta on 4 May 2023
Edited: Vilém Frynta on 4 May 2023
Hello,
I extracted all your data from last columns, which is 300 numbers. And because I knew that every table contains 10 values, I could easily reshape this 300x1 long vector into 10x30 table, where each column = last column of every table.
There's more approaches, and the one you brought up (new column is created whenever a new header is detected) could definitely work. But when we know that every 10 values are 1 column, we can just use this knowledge + indexing + reshaping and do it in more simple way.
Hope it's understandable and that I understand your question correctly.
Hope I helped.
T = readtable('test_data_for_forum.txt');
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
freq Width indep_index_1_ plot_vs_mag_Zc1C__AC_Space__0_0____ _____ _____ ______________ ___________________________________ 8e+10 1 1 166.53 8e+10 1 2 157.81 8e+10 1 3 150.77 8e+10 1 4 144.91 8e+10 1 5 139.92 8e+10 1 6 135.63 8e+10 1 7 131.89 8e+10 1 8 128.61 8e+10 1 9 125.71 8e+10 1 10 123.14 NaN NaN NaN NaN 8e+10 2 1 148.34 8e+10 2 2 141.61 8e+10 2 3 136 8e+10 2 4 131.19 8e+10 2 5 127.02 8e+10 2 6 123.38 8e+10 2 7 120.17 8e+10 2 8 117.33 8e+10 2 9 114.8 8e+10 2 10 112.55 NaN NaN NaN NaN 8e+10 3 1 134.84 8e+10 3 2 129.28 8e+10 3 3 124.6 8e+10 3 4 120.55 8e+10 3 5 116.99 8e+10 3 6 113.85 8e+10 3 7 111.05 8e+10 3 8 108.57 8e+10 3 9 106.34 8e+10 3 10 104.34 NaN NaN NaN NaN 8e+10 4 1 124.17 8e+10 4 2 119.28 8e+10 4 3 115.31 8e+10 4 4 111.82 8e+10 4 5 108.74 8e+10 4 6 106 8e+10 4 7 103.55 8e+10 4 8 101.35 8e+10 4 9 99.379 8e+10 4 10 97.6 NaN NaN NaN NaN 8e+10 5 1 115.23 8e+10 5 2 110.89 8e+10 5 3 107.43 8e+10 5 4 104.4 8e+10 5 5 101.7 8e+10 5 6 99.287 8e+10 5 7 97.117 8e+10 5 8 95.163 8e+10 5 9 93.4 8e+10 5 10 91.807 NaN NaN NaN NaN 8e+10 6 1 107.62 8e+10 6 2 103.78 8e+10 6 3 100.74 8e+10 6 4 98.074 8e+10 6 5 95.687 8e+10 6 6 93.539 8e+10 6 7 91.603 8e+10 6 8 89.853 8e+10 6 9 88.268 8e+10 6 10 86.832 NaN NaN NaN NaN 8e+10 7 1 101.05 8e+10 7 2 97.598 8e+10 7 3 94.905 8e+10 7 4 92.533 8e+10 7 5 90.404 8e+10 7 6 88.482 8e+10 7 7 86.742 8e+10 7 8 85.165 8e+10 7 9 83.733 8e+10 7 10 82.432 NaN NaN NaN NaN 8e+10 8 1 95.287 8e+10 8 2 92.162 8e+10 8 3 89.752 8e+10 8 4 87.628 8e+10 8 5 85.716 8e+10 8 6 83.985 8e+10 8 7 82.413 8e+10 8 8 80.984 8e+10 8 9 79.684 8e+10 8 10 78.499 NaN NaN NaN NaN 8e+10 9 1 90.194 8e+10 9 2 87.337 8e+10 9 3 85.164 8e+10 9 4 83.249 8e+10 9 5 81.521 8e+10 9 6 79.953 8e+10 9 7 78.526 8e+10 9 8 77.225 8e+10 9 9 76.038 8e+10 9 10 74.955 NaN NaN NaN NaN 8e+10 10 1 85.651 8e+10 10 2 83.019 8e+10 10 3 81.048 8e+10 10 4 79.311 8e+10 10 5 77.741 8e+10 10 6 76.313 8e+10 10 7 75.011 8e+10 10 8 73.821 8e+10 10 9 72.734 8e+10 10 10 71.74 NaN NaN NaN NaN 8e+10 1 1 22.122 8e+10 1 2 32.985 8e+10 1 3 41.093 8e+10 1 4 47.587 8e+10 1 5 52.98 8e+10 1 6 57.562 8e+10 1 7 61.511 8e+10 1 8 64.952 8e+10 1 9 67.974 8e+10 1 10 70.642 NaN NaN NaN NaN 8e+10 2 1 20.64 8e+10 2 2 30.549 8e+10 2 3 37.898 8e+10 2 4 43.775 8e+10 2 5 48.658 8e+10 2 6 52.809 8e+10 2 7 56.391 8e+10 2 8 59.514 8e+10 2 9 62.258 8e+10 2 10 64.683 NaN NaN NaN NaN 8e+10 3 1 19.816 8e+10 3 2 29.02 8e+10 3 3 35.822 8e+10 3 4 41.241 8e+10 3 5 45.735 8e+10 3 6 49.55 8e+10 3 7 52.84 8e+10 3 8 55.707 8e+10 3 9 58.226 8e+10 3 10 60.451 NaN NaN NaN NaN 8e+10 4 1 19.366 8e+10 4 2 27.924 8e+10 4 3 34.297 8e+10 4 4 39.354 8e+10 4 5 43.534 8e+10 4 6 47.076 8e+10 4 7 50.124 8e+10 4 8 52.778 8e+10 4 9 55.106 8e+10 4 10 57.162 NaN NaN NaN NaN 8e+10 5 1 18.935 8e+10 5 2 27.102 8e+10 5 3 33.066 8e+10 5 4 37.801 8e+10 5 5 41.709 8e+10 5 6 45.015 8e+10 5 7 47.855 8e+10 5 8 50.325 8e+10 5 9 52.489 8e+10 5 10 54.398 NaN NaN NaN NaN 8e+10 6 1 18.547 8e+10 6 2 26.375 8e+10 6 3 32.047 8e+10 6 4 36.525 8e+10 6 5 40.207 8e+10 6 6 43.311 8e+10 6 7 45.973 8e+10 6 8 48.282 8e+10 6 9 50.304 8e+10 6 10 52.085 NaN NaN NaN NaN 8e+10 7 1 18.212 8e+10 7 2 25.74 8e+10 7 3 31.156 8e+10 7 4 35.407 8e+10 7 5 38.888 8e+10 7 6 41.815 8e+10 7 7 44.318 8e+10 7 8 46.486 8e+10 7 9 48.381 8e+10 7 10 50.048 NaN NaN NaN NaN 8e+10 8 1 17.914 8e+10 8 2 25.172 8e+10 8 3 30.358 8e+10 8 4 34.407 8e+10 8 5 37.709 8e+10 8 6 40.477 8e+10 8 7 42.838 8e+10 8 8 44.88 8e+10 8 9 46.661 8e+10 8 10 48.227 NaN NaN NaN NaN 8e+10 9 1 17.643 8e+10 9 2 24.656 8e+10 9 3 29.633 8e+10 9 4 33.498 8e+10 9 5 36.639 8e+10 9 6 39.263 8e+10 9 7 41.497 8e+10 9 8 43.424 8e+10 9 9 45.103 8e+10 9 10 46.577 NaN NaN NaN NaN 8e+10 10 1 17.393 8e+10 10 2 24.18 8e+10 10 3 28.966 8e+10 10 4 32.664 8e+10 10 5 35.657 8e+10 10 6 38.15 8e+10 10 7 40.268 8e+10 10 8 42.092 8e+10 10 9 43.679 8e+10 10 10 45.07 NaN NaN NaN NaN 8e+10 1 1 610.72 8e+10 1 2 591.97 8e+10 1 3 592.77 8e+10 1 4 600.87 8e+10 1 5 612.07 8e+10 1 6 624.49 8e+10 1 7 637.18 8e+10 1 8 649.68 8e+10 1 9 661.73 8e+10 1 10 673.19 NaN NaN NaN NaN 8e+10 2 1 539.77 8e+10 2 2 505.01 8e+10 2 3 496.58 8e+10 2 4 497.15 8e+10 2 5 501.81 8e+10 2 6 508.45 8e+10 2 7 516.02 8e+10 2 8 523.93 8e+10 2 9 531.86 8e+10 2 10 539.6 NaN NaN NaN NaN 8e+10 3 1 524.05 8e+10 3 2 464.86 8e+10 3 3 448.25 8e+10 3 4 443.36 8e+10 3 5 443.63 8e+10 3 6 446.53 8e+10 3 7 450.83 8e+10 3 8 455.84 8e+10 3 9 461.17 8e+10 3 10 466.59 NaN NaN NaN NaN 8e+10 4 1 556.76 8e+10 4 2 451.98 8e+10 4 3 430.96 8e+10 4 4 423.47 8e+10 4 5 421.61 8e+10 4 6 422.66 8e+10 4 7 425.33 8e+10 4 8 428.88 8e+10 4 9 432.91 8e+10 4 10 437.16 NaN NaN NaN NaN 8e+10 5 1 582.98 8e+10 5 2 447.02 8e+10 5 3 418.86 8e+10 5 4 409.68 8e+10 5 5 406.6 8e+10 5 6 406.5 8e+10 5 7 408.07 8e+10 5 8 410.6 8e+10 5 9 413.7 8e+10 5 10 417.08 NaN NaN NaN NaN 8e+10 6 1 591.41 8e+10 6 2 443.22 8e+10 6 3 411.7 8e+10 6 4 400.54 8e+10 6 5 395.97 8e+10 6 6 394.64 8e+10 6 7 395.17 8e+10 6 8 396.79 8e+10 6 9 399.07 8e+10 6 10 401.73 NaN NaN NaN NaN 8e+10 7 1 601.41 8e+10 7 2 441.42 8e+10 7 3 406.79 8e+10 7 4 393.85 8e+10 7 5 388 8e+10 7 6 385.64 8e+10 7 7 385.29 8e+10 7 8 386.15 8e+10 7 9 387.76 8e+10 7 10 389.82 NaN NaN NaN NaN 8e+10 8 1 612.31 8e+10 8 2 440.87 8e+10 8 3 403.34 8e+10 8 4 388.81 8e+10 8 5 381.82 8e+10 8 6 378.57 8e+10 8 7 377.47 8e+10 8 8 377.69 8e+10 8 9 378.73 8e+10 8 10 380.28 NaN NaN NaN NaN 8e+10 9 1 623.73 8e+10 9 2 441.16 8e+10 9 3 400.9 8e+10 9 4 384.91 8e+10 9 5 376.91 8e+10 9 6 372.87 8e+10 9 7 371.12 8e+10 9 8 370.78 8e+10 9 9 371.33 8e+10 9 10 372.45 NaN NaN NaN NaN 8e+10 10 1 635.43 8e+10 10 2 442 8e+10 10 3 399.17 8e+10 10 4 381.84 8e+10 10 5 372.91 8e+10 10 6 368.17 8e+10 10 7 365.85 8e+10 10 8 365.03 8e+10 10 9 365.16 8e+10 10 10 365.9
idx = ~isnan(table2array(T(:,4))); % index of ALL your numbers
v = table2array(T(idx,4)); % all your numbers from last column
v = reshape(v, 10, 30) % reshape 300 numbers into table
v = 10×30
166.5316 148.3423 134.8450 124.1679 115.2304 107.6222 101.0452 95.2869 90.1938 85.6509 22.1223 20.6401 19.8161 19.3659 18.9346 18.5474 18.2120 17.9137 17.6429 17.3934 610.7228 539.7700 524.0528 556.7579 582.9842 591.4124 601.4065 612.3104 623.7318 635.4279 157.8130 141.6124 129.2754 119.2830 110.8930 103.7800 97.5982 92.1624 87.3369 83.0192 32.9850 30.5492 29.0200 27.9242 27.1017 26.3746 25.7398 25.1721 24.6556 24.1795 591.9658 505.0126 464.8621 451.9789 447.0172 443.2228 441.4166 440.8735 441.1579 441.9953 150.7727 135.9954 124.6015 115.3060 107.4318 100.7428 94.9047 89.7525 85.1645 81.0480 41.0935 37.8983 35.8216 34.2973 33.0663 32.0474 31.1558 30.3581 29.6329 28.9655 592.7707 496.5848 448.2505 430.9639 418.8575 411.6989 406.7851 403.3426 400.9035 399.1700 144.9061 131.1890 120.5458 111.8247 104.4011 98.0740 92.5330 87.6284 83.2494 79.3110 47.5868 43.7749 41.2410 39.3541 37.8010 36.5253 35.4072 34.4070 33.4984 32.6637 600.8686 497.1515 443.3584 423.4690 409.6847 400.5365 393.8539 388.8132 384.9140 381.8356 139.9234 127.0240 116.9879 108.7436 101.7029 95.6865 90.4036 85.7161 81.5213 77.7410 52.9805 48.6579 45.7349 43.5343 41.7093 40.2067 38.8885 37.7091 36.6387 35.6567 612.0725 501.8093 443.6328 421.6066 406.6041 395.9663 387.9962 381.8210 376.9082 372.9149 135.6311 123.3815 113.8450 106.0013 99.2872 93.5395 88.4816 83.9847 79.9529 76.3130 57.5618 52.8089 49.5505 47.0759 45.0147 43.3110 41.8149 40.4765 39.2627 38.1502 624.4881 508.4519 446.5345 422.6620 406.5008 394.6432 385.6385 378.5678 372.8679 368.1750 131.8937 120.1732 111.0541 103.5505 97.1174 91.6029 86.7421 82.4131 78.5256 75.0107 61.5114 56.3907 52.8403 50.1245 47.8555 45.9728 44.3183 42.8382 41.4966 40.2682 637.1845 516.0189 450.8281 425.3265 408.0672 395.1659 385.2869 377.4682 371.1183 365.8536 128.6129 117.3315 108.5655 101.3535 95.1635 89.8527 85.1650 80.9843 77.2249 73.8214 64.9523 59.5138 55.7074 52.7780 50.3247 48.2822 46.4862 44.8796 43.4239 42.0920 649.6821 523.9312 455.8387 428.8845 410.6041 396.7885 386.1495 377.6854 370.7788 365.0279 125.7147 114.8033 106.3391 99.3788 93.4005 88.2682 83.7331 79.6839 76.0385 72.7344 67.9736 62.2579 58.2255 55.1064 52.4889 50.3035 48.3808 46.6609 45.1031 43.6787 661.7283 531.8577 461.1721 432.9149 413.6958 399.0684 387.7576 378.7258 371.3318 365.1573 123.1420 112.5459 104.3419 97.6004 91.8072 86.8320 82.4318 78.4993 74.9554 71.7403 70.6423 64.6832 60.4506 57.1623 54.3981 52.0845 50.0482 48.2265 46.5772 45.0700 673.1923 539.6047 466.5890 437.1584 417.0813 401.7271 389.8165 380.2793 372.4528 365.9033
% each column = last column from each table

Siddharth Bhutiya
Siddharth Bhutiya on 4 May 2023
If the values in the common variables across all your tables are the same then this can be done with a simple outerjoin operation.
A = [1;2;3;4];
B = [4;5;6;7];
C = [10;20;30;40];
D = [11;22;33;44];
E = [1;2;3;4];
t1 = table(A,B,C)
t1 = 4×3 table
A B C _ _ __ 1 4 10 2 5 20 3 6 30 4 7 40
t2 = table(A,B,C,D)
t2 = 4×4 table
A B C D _ _ __ __ 1 4 10 11 2 5 20 22 3 6 30 33 4 7 40 44
t3 = table(A,B,E)
t3 = 4×3 table
A B E _ _ _ 1 4 1 2 5 2 3 6 3 4 7 4
t1 = outerjoin(t1,t2,MergeKeys=true)
t1 = 4×4 table
A B C D _ _ __ __ 1 4 10 11 2 5 20 22 3 6 30 33 4 7 40 44
t1 = outerjoin(t1,t3,MergeKeys=true)
t1 = 4×5 table
A B C D E _ _ __ __ _ 1 4 10 11 1 2 5 20 22 2 3 6 30 33 3 4 7 40 44 4

Stephen23
Stephen23 on 5 May 2023
The old-fashioned way:
D = {}; % data
H = {}; % header
fid = fopen('test_data_for_forum.txt','rt');
while ~feof(fid)
H{end+1} = fgetl(fid);
D(end+1) = textscan(fid,'%f%f%f%f','CollectOutput',true,'Delimiter','\t');
end
fclose(fid);
A = cat(3,D{:});
M = squeeze(A(:,end,:))
M = 10×30
166.5316 148.3423 134.8450 124.1679 115.2304 107.6222 101.0452 95.2869 90.1938 85.6509 22.1223 20.6401 19.8161 19.3659 18.9346 18.5474 18.2120 17.9137 17.6429 17.3934 610.7228 539.7700 524.0528 556.7579 582.9842 591.4124 601.4065 612.3104 623.7318 635.4279 157.8130 141.6124 129.2754 119.2830 110.8930 103.7800 97.5982 92.1624 87.3369 83.0192 32.9850 30.5492 29.0200 27.9242 27.1017 26.3746 25.7398 25.1721 24.6556 24.1795 591.9658 505.0126 464.8621 451.9789 447.0172 443.2228 441.4166 440.8735 441.1579 441.9953 150.7727 135.9954 124.6015 115.3060 107.4318 100.7428 94.9047 89.7525 85.1645 81.0480 41.0935 37.8983 35.8216 34.2973 33.0663 32.0474 31.1558 30.3581 29.6329 28.9655 592.7707 496.5848 448.2505 430.9639 418.8575 411.6989 406.7851 403.3426 400.9035 399.1700 144.9061 131.1890 120.5458 111.8247 104.4011 98.0740 92.5330 87.6284 83.2494 79.3110 47.5868 43.7749 41.2410 39.3541 37.8010 36.5253 35.4072 34.4070 33.4984 32.6637 600.8686 497.1515 443.3584 423.4690 409.6847 400.5365 393.8539 388.8132 384.9140 381.8356 139.9234 127.0240 116.9879 108.7436 101.7029 95.6865 90.4036 85.7161 81.5213 77.7410 52.9805 48.6579 45.7349 43.5343 41.7093 40.2067 38.8885 37.7091 36.6387 35.6567 612.0725 501.8093 443.6328 421.6066 406.6041 395.9663 387.9962 381.8210 376.9082 372.9149 135.6311 123.3815 113.8450 106.0013 99.2872 93.5395 88.4816 83.9847 79.9529 76.3130 57.5618 52.8089 49.5505 47.0759 45.0147 43.3110 41.8149 40.4765 39.2627 38.1502 624.4881 508.4519 446.5345 422.6620 406.5008 394.6432 385.6385 378.5678 372.8679 368.1750 131.8937 120.1732 111.0541 103.5505 97.1174 91.6029 86.7421 82.4131 78.5256 75.0107 61.5114 56.3907 52.8403 50.1245 47.8555 45.9728 44.3183 42.8382 41.4966 40.2682 637.1845 516.0189 450.8281 425.3265 408.0672 395.1659 385.2869 377.4682 371.1183 365.8536 128.6129 117.3315 108.5655 101.3535 95.1635 89.8527 85.1650 80.9843 77.2249 73.8214 64.9523 59.5138 55.7074 52.7780 50.3247 48.2822 46.4862 44.8796 43.4239 42.0920 649.6821 523.9312 455.8387 428.8845 410.6041 396.7885 386.1495 377.6854 370.7788 365.0279 125.7147 114.8033 106.3391 99.3788 93.4005 88.2682 83.7331 79.6839 76.0385 72.7344 67.9736 62.2579 58.2255 55.1064 52.4889 50.3035 48.3808 46.6609 45.1031 43.6787 661.7283 531.8577 461.1721 432.9149 413.6958 399.0684 387.7576 378.7258 371.3318 365.1573 123.1420 112.5459 104.3419 97.6004 91.8072 86.8320 82.4318 78.4993 74.9554 71.7403 70.6423 64.6832 60.4506 57.1623 54.3981 52.0845 50.0482 48.2265 46.5772 45.0700 673.1923 539.6047 466.5890 437.1584 417.0813 401.7271 389.8165 380.2793 372.4528 365.9033
plot(M)

Categories

Find more on Data Type Conversion in Help Center and File Exchange

Tags

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!