Read equations from MS Word

Dear all,
I am trying to import all the formulas in a Word document into MATLAB (plain text). The goal would then be to make a script that computes these imported formulas with the data I have.
I have been looking for a solution without any luck. I have tried with actxserver, but it seems like the wdoc.OMaths.ConvertToNormalText function does not exist in MATLAB.
word = actxserver('Word.Application');
wdoc = word.Documents.Open("C:\Users\pehlivanlar\Documents\test.docx");
size(wdoc.OMaths)
wdoc.OMaths.ConvertToNormalText
Unrecognized function or variable 'ConvertToNormalText'.
The original equations were created using Equation Editor 3.0, but I can easily convert them to Office Math ML format.
Thanks in advance for the help.
Br
  2 Comments
Daniel Neubauer on 4 Nov 2022
Is it possible to save your .docx file as a plain text file with a *.m extension? Then you could just run it.
Shivam Malviya on 7 Nov 2022
Hi Benjamin,
Could you share the Word document that contains the equations?
Thanks,
Shivam


Accepted Answer

Shivam Malviya on 7 Nov 2022
Hi Benjamin,
As I understand it, you are trying to do the following:
  • Convert a Word document containing equations into plain text.
  • Evaluate the equations from that text file on your data.
The Word document I used contains the following equations:
x.^1.5 + log10(x)
2*x + cos(x)
To convert the Word document into a text file, the following script can be used:
% File paths
wordFile = [pwd filesep 'test.docx'];
txtFile = [pwd filesep 'test.txt'];
% Start the Word application
word = actxserver('Word.Application');
% Get the document handle
document = word.Documents.Open(wordFile);
% Save as a text file (2 = wdFormatText, i.e. plain text)
document.SaveAs2(txtFile, 2)
% Close the document
document.Close;
% Close the application
word.Quit;
% Print statement
disp("Converted successfully to text!!");
To evaluate the equations in the text file on the data, the following script can be used:
% Import the text file that contains the formula
strEquations = fileread('test.txt');
% Split the file contents into one string per line
strEquations = strsplit(strEquations, newline);
% Create a helper that converts an equation string into a function handle of x
strToFuncHndl = @(x) str2func(['@(x)' x]);
% Convert each equation string into a function handle
equations = cellfun(strToFuncHndl, strEquations(1:end-1), 'UniformOutput',false);
% Create data
x = 1:10;
% Input data to the equations
disp("Output to the first equation");
equations{1}(x)
disp("Output to the second equation");
equations{2}(x)
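If the file holds more than two equations, the handles can also be evaluated in a loop instead of being indexed one by one. This is only a minimal sketch, assuming every handle accepts the vector x and returns a vector of the same length; the hard-coded cell array stands in for the handles built from test.txt, and the name results is illustrative.

```matlab
% Stand-in for the handles built from test.txt (assumption for this sketch)
equations = {@(x) x.^1.5 + log10(x), @(x) 2*x + cos(x)};

% Create data
x = 1:10;

% One row of results per equation
results = zeros(numel(equations), numel(x));
for k = 1:numel(equations)
    results(k, :) = equations{k}(x);
end

disp(results);
```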
Hope this helps.
Thanks,
Shivam Malviya
  2 Comments
Benjamin Pehlivanlar on 7 Nov 2022
Hi Shivam,
Thanks a lot for your quick answer!
I've only just seen it, and for confidentiality reasons I couldn't have shared the .docx file anyway, so I would have made a separate file. But your solution works, and I hadn't thought that it would be as trivial as saving the .docx as a .txt file! Elegant solution!
However, I used the function
splitlines(strEquations)
instead of
strEquations = strsplit(strEquations, newline);
It is better suited to longer equations that might otherwise be truncated (at least in my case).
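A minimal sketch of that variant, assuming one equation per line in valid MATLAB syntax with x as the only variable. The sample string stands in for the output of fileread('test.txt'), the CRLF line endings are an assumption about how Word writes the text file, and the names eqLines and results are illustrative.

```matlab
% Stand-in for fileread('test.txt'); CRLF line endings are assumed here
strEquations = sprintf('x.^1.5 + log10(x)\r\n2*x + cos(x)\r\n');

% splitlines recognises \r\n, \n and \r, so no stray carriage returns remain
eqLines = strtrim(splitlines(strEquations));

% Drop empty lines instead of relying on strEquations(1:end-1)
eqLines = eqLines(~cellfun(@isempty, eqLines));

% Convert each remaining line into a function handle of x
equations = cellfun(@(s) str2func(['@(x)' s]), eqLines, 'UniformOutput', false);

% Evaluate every equation on the same data
x = 1:10;
results = cellfun(@(f) f(x), equations, 'UniformOutput', false);
```

Filtering out empty lines also covers files that end with more than one blank line, where dropping only the last element would leave an empty string for str2func.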
Best regards,
Benjamin
Benjamin Pehlivanlar on 8 Nov 2022
Hi Shivam,
One more question: some documents contain equations in an older format (see screenshot), and these do not show up when the document is converted to .txt. Is there an easy way to convert all of them into Office Math ML?
Thanks
Br

Release

R2022a
