Detect lines in scatter plot

Douglas Anderson on 15 Jan 2019
Commented: Douglas Anderson on 27 Jul 2020
Hello,
I have, as an example, the following data plot:
The Y data are in integer increments, so the horizontal "lines" of data aren't really lines. The X data are also integers, but sampled much more finely. What matters is the X value of each vertical line. They are not perfect lines, though!
For example, the vertical line on the left (X ≈ 80) has a jog between Y = 10 and Y = 9; there are gaps on the third line (X ≈ 158) at Y = 16 and from Y = 11 to Y = 13; and there are doublets on several lines, such as the far right line (X ≈ 613) at Y = 16 and Y = 17.
I have thought of a Hough transform, but that seems like overkill, and I don't have Image Processing Toolbox. This really isn't a pattern recognition problem, is it? I do have Signal Processing Toolbox, but I'm not sure whether anything there applies.
I have R2018b, but am using this on R2017a. Am I missing something?
Thoughts are welcome! Thanks!
Doug Anderson
PS: The scatter in the X values for a given line varies from about 2 to 4, with separation of about 30 or 40 or so between lines. This is, of course, excluding the jog mentioned, and the doublets.
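
Given the numbers in the PS (scatter of about 2 to 4 within a line, about 30 to 40 between lines), one toolbox-free idea is to sort the X values and start a new group wherever the gap to the previous value exceeds a threshold. The following is only a minimal sketch of that idea, written in Python for illustration. The function name `group_by_gap`, the `min_gap` value of 15, and the sample data are all assumptions, not taken from the actual plot. The same logic should carry over to sort() and diff() in base MATLAB.

```python
def group_by_gap(xs, min_gap=15):
    """Group 1-D values into clusters separated by gaps larger than min_gap.

    min_gap is an assumed threshold: larger than the within-line scatter
    (about 2 to 4) and smaller than the spacing between lines (about 30 to 40).
    Returns a list of clusters, each a sorted list of X values.
    """
    clusters = []
    for x in sorted(xs):
        # Extend the current cluster while the step from its last value is small.
        if clusters and x - clusters[-1][-1] <= min_gap:
            clusters[-1].append(x)
        else:
            clusters.append([x])
    return clusters


# Made-up X values loosely resembling three of the lines described above.
xs = [79, 81, 80, 82, 158, 157, 159, 160, 613, 612, 615, 614]
groups = group_by_gap(xs)
for g in groups:
    print("line near X =", g[len(g) // 2], "with", len(g), "points")
```

Because the threshold only has to sit between the two scales, it does not need careful tuning, but a jog or doublet wider than `min_gap` would be split off as its own group.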
  3 Comments
Image Analyst on 26 Jul 2020
What line(s) are you hoping to find? Can you draw a red line over the blue circles that shows the line(s)? And what does "detect" mean to you? Do you want the coordinates and indexes of the points that are within some threshold distance of your line? Do you want the slope and intercept of the equation of the line(s)?
In the "doublet" cases, are both of those points considered to be on the line (whatever that means) or to be considered as input as to what determined the line? Or do you only want the points that are more in-line with the majority of the points?
Do you have the Statistics and Machine Learning Toolbox? If so you can use kmeans() to determine how many vertical groupings there are and which blue circles are in which group.
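
For readers without that toolbox, the grouping idea can be sketched with a plain Lloyd's iteration on the X values alone. This is an illustrative Python sketch, not the toolbox kmeans() implementation. The function name `kmeans_1d`, the initialization scheme, and the sample data are assumptions. Note that k-means takes the number of groups `k` as an input, so it has to be chosen beforehand (for example by counting the lines by eye).

```python
def kmeans_1d(xs, k, iters=100):
    """Plain Lloyd's algorithm on 1-D data.

    Returns (centers, labels), where labels[i] is the index of the
    center assigned to xs[i].
    """
    ordered = sorted(xs)
    n = len(ordered)
    # Assumed initialization: centers at evenly spaced ranks of the sorted data.
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: abs(x - centers[j])) for x in xs]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(xs, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Made-up X values loosely resembling three vertical groupings.
xs = [79, 81, 80, 82, 158, 157, 159, 160, 613, 612, 615, 614]
centers, labels = kmeans_1d(xs, k=3)
print(centers)
print(labels)
```

The centers give an estimate of each line's X position, and the labels say which group each blue circle belongs to.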
Douglas Anderson on 27 Jul 2020
Hello!
I am attaching a figure with the red marker as you requested. All of the lines are essentially subparallel to the y-axis. Sometimes there are a lot of blue circles at the bottom, related to the sensitivity. But as you suggested, I only want the points that are more in-line with the majority of the points.
Note that even if the lines appear strictly vertical, you can see from the horizontal scale that there are in fact slight variations in X along each one. Both scales are integers, but the horizontal one spans far more values than the vertical.
I do not currently have either the Statistics and Machine Learning Toolbox for kmeans() or the Computer Vision Toolbox for ransac(). Would one be preferable over the other?
Thank you to both of you for your suggestions!
Doug
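
One toolbox-free way to keep only the points that are in line with the majority is to take the median X of each group and drop anything farther than a tolerance from it. The sketch below illustrates this in Python for a single group. The function name `inliers`, the tolerance of 3, and the sample points (including the doublet offsets) are assumptions made up for the example. In base MATLAB the same idea would use median() and logical indexing.

```python
from statistics import median


def inliers(points, tol=3):
    """Keep (x, y) points whose x lies within tol of the group's median x.

    tol is an assumed tolerance, chosen near the stated within-line
    scatter of about 2 to 4.
    Returns (median_x, kept_points).
    """
    mid = median(x for x, _ in points)
    kept = [(x, y) for x, y in points if abs(x - mid) <= tol]
    return mid, kept


# Made-up points for one near-vertical line with doublets at y = 17 and y = 16.
line = [(613, 20), (612, 19), (614, 18), (613, 17), (622, 17),
        (612, 16), (621, 16), (613, 15)]
mid, kept = inliers(line)
print("line position:", mid)
print("kept points:", kept)
```

The median is used rather than the mean so that a few doublet or jog points do not pull the estimated line position toward themselves.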

Answers (0)

Release

R2017a
