Big CSV file read in matlab

Sudipta Das on 2 Jul 2024
Commented: Rik on 3 Jul 2024
I have a CSV file with 5 columns and an unknown number of rows (all of the data are numeric). Its size is 45 GB.
I want to read it in MATLAB and plot the data against time.
Is there any way to do this?
  3 Comments
Sudipta Das on 2 Jul 2024
How do I do that?
I am a beginner with big data handling.
Rik on 2 Jul 2024
I was putting it in a comment, then I realized it was actually an answer, so see below.


Answers (2)

Aditya on 2 Jul 2024
Hi Sudipta,
Handling a 45 GB CSV file in MATLAB is challenging because the file will not fit in memory on most machines. You can use MATLAB's `datastore` function, which handles large files by reading and processing the data in chunks rather than all at once.
For detailed guidance, please refer to this MATLAB Answer: Loading very large CSV files (~20GB) - MATLAB Answers - MATLAB Central (mathworks.com)
Additionally, you can read more about the `datastore` function here: Getting Started with Datastore - MATLAB & Simulink - MathWorks India
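As a rough illustration of the chunked approach, here is a minimal sketch using `tabularTextDatastore`. It assumes the file has no header row and that the first column is time; neither is stated in the question. The stand-in file, the `ReadSize` of 2000 rows, and the decimation factor `step` are made-up values for demonstration, so replace `csvFile` with your real path and tune the numbers to your memory. It keeps only every `step`-th row so that the reduced data is small enough to plot.

```matlab
% Small stand-in CSV so the sketch runs as-is (hypothetical data, not the real file).
csvFile = [tempname '.csv'];
t = (0:9999)';
writematrix([t, sin(t/100), cos(t/100), 0.5*t, ones(size(t))], csvFile);

% The datastore reads the file block by block instead of loading it whole.
ds = tabularTextDatastore(csvFile, 'ReadVariableNames', false);
ds.ReadSize = 2000;            % maximum rows returned per call to read (assumed value)

step     = 100;                % keep every 100th row for plotting (assumed value)
rowsSeen = 0;                  % rows processed so far across all blocks
reduced  = zeros(0, 5);        % decimated rows collected for the plot

while hasdata(ds)
    M = table2array(read(ds));             % current block as a numeric matrix
    n = size(M, 1);
    globalIdx = rowsSeen + (1:n)';         % row numbers within the whole file
    keep = mod(globalIdx - 1, step) == 0;  % rows 1, 101, 201, ...
    reduced  = [reduced; M(keep, :)];      %#ok<AGROW> small, so growth is cheap
    rowsSeen = rowsSeen + n;
end

% Plot columns 2 to 5 against column 1 (assumed here to be time).
plot(reduced(:, 1), reduced(:, 2:end));
xlabel('time');

delete(csvFile);               % remove the stand-in file
```

Plain decimation can hide short spikes. If those matter, compute a per-block summary such as min, max, or mean instead of picking single rows.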
I hope this helps!

Rik on 2 Jul 2024
You can use fgetl to read a single line. Do that in a loop to retrieve 5000 lines (or something like that). Then do whatever summarization you need (histcounts, 2D heatmap, whatever) and continue with the next 5000 lines.
Note that fgetl returns a char vector you will have to parse yourself (e.g. with sscanf (yes, double s)). You might also want to check what happens with the last chunk (which is unlikely to be exactly your chunk size).
The beauty of this system is that you only store the intermediate values, so memory use stays small, and you can pause and continue later if you track the number of completed chunks.
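The loop described above could look roughly like the following sketch. It assumes comma-separated numeric lines with no header row, and it uses a per-chunk column mean as a placeholder for the summarization step. The stand-in file and the variable names (`buf`, `chunkMeans`, `chunksDone`) are illustrative only.

```matlab
% Small stand-in CSV so the sketch runs as-is (hypothetical data, not the real file).
csvFile = [tempname '.csv'];
writematrix((1:12345)' * [1 2 3 4 5], csvFile);

chunkSize  = 5000;                   % lines per chunk, as suggested above
nCols      = 5;
buf        = zeros(chunkSize, nCols);
nInBuf     = 0;                      % rows currently held in buf
chunksDone = 0;                      % track this to be able to pause and resume
totalRows  = 0;
chunkMeans = zeros(0, nCols);        % one summary row per chunk

fid = fopen(csvFile, 'r');
assert(fid ~= -1, 'Could not open %s', csvFile);

txt = fgetl(fid);                    % char vector, or -1 at end of file
while ischar(txt)
    vals = sscanf(txt, '%f,');       % parse the comma-separated numbers
    if numel(vals) == nCols          % skip blank or malformed lines
        nInBuf = nInBuf + 1;
        buf(nInBuf, :) = vals';
    end
    if nInBuf == chunkSize           % full chunk: summarize and reset
        chunkMeans(end+1, :) = mean(buf, 1); %#ok<SAGROW>
        totalRows  = totalRows + nInBuf;
        chunksDone = chunksDone + 1;
        nInBuf     = 0;
    end
    txt = fgetl(fid);
end
fclose(fid);

% The last chunk is usually shorter than chunkSize, so handle it separately.
if nInBuf > 0
    chunkMeans(end+1, :) = mean(buf(1:nInBuf, :), 1);
    totalRows  = totalRows + nInBuf;
    chunksDone = chunksDone + 1;
end

delete(csvFile);                     % remove the stand-in file
```

To make this resumable, you could save `chunkMeans` and `chunksDone` to a MAT file after each chunk and skip `chunksDone*chunkSize` lines when restarting. That skip count is only correct if every line parses, because malformed lines are not counted toward a chunk.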
  1 Comment
Rik on 3 Jul 2024
Did you try this solution already? I couldn't easily tell what exactly you edited in the question.
