Line Detection using Multiple Object Tracking

Include <scribo/segdet.hpp>

std::tuple<mln::image2d<std::uint16_t>, std::vector<LSuperposition>> detect_line_pixel(const mln::image2d<std::uint8_t> &input, int min_len, const SegDetParams &params);

std::vector<VSegment> detect_line_vector(const mln::image2d<std::uint8_t> &input, int min_len, const SegDetParams &params);

std::tuple<mln::image2d<std::uint16_t>, std::vector<LSuperposition>, std::vector<VSegment>> detect_line_full(const mln::image2d<std::uint8_t> &input, int min_len, const SegDetParams &params);

Detect lines in a greyscale image with a complete pipeline based on multiple object tracking (MOT).

Parameters:

input – The input greyscale image
min_len – The minimum length (in pixels) of the segments to detect
params – The SegDetParams struct holding the parameters of the method.

Input

enum class scribo::e_segdet_preprocess

e_segdet_preprocess Specifies the preprocessing to apply

Values:

enumerator NONE: None.

enumerator BLACK_TOP_HAT: Black-Top-Hat with specific filter size and dynamic.
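To illustrate what a black top-hat does, here is a minimal one-dimensional sketch: the morphological closing of the signal minus the signal itself, which keeps dark details narrower than the filter size. The names `sliding` and `black_top_hat_1d` are hypothetical; the library works on 2D images, and the role of the dynamic (`dyn`) is not covered here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sliding-window reduction with a flat, centred window of odd size.
// The window is truncated at the borders of the signal.
template <class Select>
std::vector<std::uint8_t> sliding(const std::vector<std::uint8_t>& f, int size, Select select)
{
  assert(size > 0 && size % 2 == 1);
  const int n = static_cast<int>(f.size());
  const int r = size / 2;
  std::vector<std::uint8_t> out(f.size());
  for (int i = 0; i < n; ++i)
  {
    std::uint8_t v = f[i];
    for (int j = std::max(0, i - r); j <= std::min(n - 1, i + r); ++j)
      v = select(v, f[j]);
    out[i] = v;
  }
  return out;
}

// Black top-hat: closing(f) - f, where closing = erosion(dilation(f)).
std::vector<std::uint8_t> black_top_hat_1d(const std::vector<std::uint8_t>& f, int size)
{
  auto max_of = [](std::uint8_t a, std::uint8_t b) { return std::max(a, b); };
  auto min_of = [](std::uint8_t a, std::uint8_t b) { return std::min(a, b); };
  const auto closed = sliding(sliding(f, size, max_of), size, min_of);

  std::vector<std::uint8_t> out(f.size());
  for (std::size_t i = 0; i < f.size(); ++i)
    out[i] = static_cast<std::uint8_t>(closed[i] - f[i]); // closing never goes below f
  return out;
}
```

On a light signal with a single dark sample, only that sample gets a non-zero response, which is why this kind of preprocessing can make thin dark lines stand out from an uneven background.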

enum class scribo::e_segdet_process_tracking

e_segdet_process_tracking Specifies which tracker is used

Values:

enumerator KALMAN: Kalman filter with the classic prediction and correction steps, based on the IRISA article.

enumerator ONE_EURO: One Euro Filter (adapted from Nicolas Roussel's code).

enumerator DOUBLE_EXPONENTIAL: Double exponential.

enumerator LAST_INTEGRATION: Last observation.

enumerator SIMPLE_MOVING_AVERAGE: Simple moving average.

enumerator EXPONENTIAL_MOVING_AVERAGE: Exponential moving average.
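As an illustration of one of the alternative trackers, the sketch below implements the One Euro filter as it is commonly published: a low-pass filter whose cutoff frequency grows with the speed of the signal. The class name `OneEuroSketch` and the `rate` parameter are assumptions made for this example; it is not the library's implementation, and how scribo feeds observations to its tracker is not described here.

```cpp
#include <cassert>
#include <cmath>

// One Euro filter sketch (hypothetical class, not the scribo implementation).
class OneEuroSketch
{
public:
  OneEuroSketch(double rate, double min_cutoff, double beta, double d_cutoff)
    : m_rate(rate), m_min_cutoff(min_cutoff), m_beta(beta), m_d_cutoff(d_cutoff)
  {
    assert(rate > 0.0 && min_cutoff > 0.0 && d_cutoff > 0.0);
  }

  // Feed one observation and return the filtered value.
  double filter(double x)
  {
    if (!m_initialized)
    {
      // The first observation is returned unchanged.
      m_initialized = true;
      m_x           = x;
      m_dx          = 0.0;
      return x;
    }
    // Derivative estimated from the previous filtered value, then smoothed.
    const double dx = (x - m_x) * m_rate;
    m_dx            = low_pass(dx, m_dx, alpha(m_d_cutoff));
    // The faster the signal moves, the higher the cutoff (less smoothing, less lag).
    const double cutoff = m_min_cutoff + m_beta * std::abs(m_dx);
    m_x                 = low_pass(x, m_x, alpha(cutoff));
    return m_x;
  }

private:
  // Smoothing factor of a first-order low-pass filter for a given cutoff frequency.
  double alpha(double cutoff) const
  {
    const double pi  = std::acos(-1.0);
    const double tau = 1.0 / (2.0 * pi * cutoff);
    return 1.0 / (1.0 + tau * m_rate);
  }

  static double low_pass(double x, double previous, double a) { return a * x + (1.0 - a) * previous; }

  double m_rate, m_min_cutoff, m_beta, m_d_cutoff;
  double m_x           = 0.0;
  double m_dx          = 0.0;
  bool   m_initialized = false;
};
```

The three tuning values map naturally onto the `one_euro_mincutoff`, `one_euro_beta` and `one_euro_dcutoff` members of `SegDetParams`; a rate of one sample per step is a plausible choice when one observation is made per image column, but that is an assumption.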

enum class scribo::e_segdet_process_extraction

e_segdet_process_extraction Specifies how observations are extracted

Values:

enumerator BINARY: Binary extraction with threshold.

enumerator GRADIENT: Gradient extraction with threshold.
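The idea behind a binary extraction with a threshold can be sketched on a single image column: consecutive pixels on the foreground side of the threshold form a span, and spans thicker than a maximum are discarded. The names `Span` and `extract_spans` are hypothetical, and the sketch assumes dark lines on a light background (a pixel below the threshold is foreground); the library's actual extraction may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// A run of consecutive foreground pixels in one column (hypothetical type).
struct Span
{
  int start;     // index of the first foreground pixel
  int thickness; // number of consecutive foreground pixels
};

// Collect the spans of a column whose thickness does not exceed max_thickness.
std::vector<Span> extract_spans(const std::vector<std::uint8_t>& column, std::uint8_t threshold, int max_thickness)
{
  assert(max_thickness > 0);
  std::vector<Span> spans;
  const int         n = static_cast<int>(column.size());
  int               i = 0;
  while (i < n)
  {
    if (column[i] >= threshold) // background pixel, skip it
    {
      ++i;
      continue;
    }
    const int start = i;
    while (i < n && column[i] < threshold) // walk to the end of the dark run
      ++i;
    const int thickness = i - start;
    if (thickness <= max_thickness)
      spans.push_back(Span{start, thickness});
  }
  return spans;
}
```

Each span kept this way plays the role of an observation (position and thickness) that a tracker can then try to match.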

enum class scribo::e_segdet_process_traversal_mode

e_segdet_process_traversal_mode Specifies the traversal performed for line detection

Values:

enumerator HORIZONTAL: Only horizontal traversal is performed.

enumerator VERTICAL: Only vertical traversal is performed.

enumerator HORIZONTAL_VERTICAL: Both horizontal and vertical traversal are performed.

struct SegDetParams

SegDetParams holds parameters of the line detection.

Public Functions

bool is_valid() const

Check whether the parameter values are compatible.

Returns: true if the parameters are valid

Public Members

e_segdet_preprocess preprocess = e_segdet_preprocess::NONE : Preprocess applied.

e_segdet_process_tracking tracker = e_segdet_process_tracking::KALMAN : Tracker used.

e_segdet_process_traversal_mode traversal_mode = e_segdet_process_traversal_mode::HORIZONTAL_VERTICAL : Traversal performed.

e_segdet_process_extraction extraction_type = e_segdet_process_extraction::BINARY : Extraction type for observations.

bool negate_image = false: Whether the image has to be inverted before processing.

float dyn = 0.6f: Dynamic when Black-Top-Hat preprocess is applied.

int size_mask = 11: Filter size when Black-Top-Hat preprocess is applied.

float double_exponential_alpha = 0.6f: Alpha used in double exponential tracker if chosen.

float simple_moving_average_memory = 30.0f: Memory used in simple moving average tracker if chosen.

float exponential_moving_average_memory = 16.0f: Memory used in exponential moving average tracker if chosen.

float one_euro_beta = 0.007f: Beta used in one euro tracker if chosen.

float one_euro_mincutoff = 1.0f: Min cutoff used in one euro tracker if chosen.

float one_euro_dcutoff = 1.0f: Dcutoff used in one euro tracker if chosen.

int bucket_size = 32: Bucket size during traversal.

int nb_values_to_keep = 30: Memory of tracker to compute variances for the matching.

int discontinuity_relative = 0: Relative part of the allowed discontinuity, as a percentage of the current segment size. Discontinuity = discontinuity_absolute + discontinuity_relative * current_segment_size.

int discontinuity_absolute = 0: Absolute part of the allowed discontinuity. Discontinuity = discontinuity_absolute + discontinuity_relative * current_segment_size.

int minimum_for_fusion = 15: Threshold to merge trackers following the same observation.

int default_sigma_position = 2: Position default variance value.

int default_sigma_thickness = 2: Thickness default variance value.

int default_sigma_luminosity = 57: Luminosity default variance value.

int min_nb_values_sigma = 10: Minimum number of values required to compute the variance instead of using the default values.

float sigma_pos_min = 1.f: Minimum position variance value.

float sigma_thickness_min = 0.64f: Minimum thickness variance value.

float sigma_luminosity_min = 13.f: Minimum luminosity variance value.

int gradient_threshold = 30: Gradient threshold when gradient extraction is used.

int llumi = 225: First threshold for observation ternary extraction.

int blumi = 225: Second threshold for observation ternary extraction.

float ratio_lum = 1.f: Ratio of kept luminosity in observation extraction.

int max_thickness = 100: Max allowed (vertical|horizontal) thickness of segment to detect.

float threshold_intersection = 0.8f: Threshold for duplicate removal.

bool remove_duplicates = true: Whether duplicate removal is performed.
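The discontinuity formula given for `discontinuity_absolute` and `discontinuity_relative` can be made concrete with a short sketch. Since the relative term is described as a percentage, the example assumes it is divided by 100 before being applied; this scaling and the function name `allowed_discontinuity` are assumptions, not something the documentation states.

```cpp
#include <cassert>

// Allowed gap for a segment of the given size.
// Assumption: discontinuity_relative is a percentage, hence the division by 100.
int allowed_discontinuity(int discontinuity_absolute, int discontinuity_relative, int current_segment_size)
{
  assert(current_segment_size >= 0);
  return discontinuity_absolute + discontinuity_relative * current_segment_size / 100;
}
```

Under this reading, `discontinuity_relative = 30` with a segment that is currently 50 pixels long would tolerate a gap of 15 pixels, so longer segments are allowed to bridge longer interruptions.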

Outputs

struct VSegment

VSegment structure holding vectorial information about detected lines.

Public Members

int label: Label of segment.

int x0: First coordinate of first point.

int y0: Second coordinate of first point.

int x1: First coordinate of second point.

int y1: Second coordinate of second point.
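A typical way to consume this kind of vectorial output is to compute the length of each segment from its two end points, for example to apply a stricter length filter afterwards. The sketch below uses a local stand-in struct, `SegmentSketch`, that only mirrors the documented members; it and the helper functions are hypothetical and not part of the library.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Local stand-in mirroring the documented members of VSegment (hypothetical type).
struct SegmentSketch
{
  int label;
  int x0;
  int y0;
  int x1;
  int y1;
};

// Euclidean distance between the two end points.
double segment_length(const SegmentSketch& s)
{
  return std::hypot(static_cast<double>(s.x1 - s.x0), static_cast<double>(s.y1 - s.y0));
}

// Keep only the segments that are at least min_len long.
std::vector<SegmentSketch> longer_than(const std::vector<SegmentSketch>& segments, double min_len)
{
  assert(min_len >= 0.0);
  std::vector<SegmentSketch> kept;
  for (const auto& s : segments)
    if (segment_length(s) >= min_len)
      kept.push_back(s);
  return kept;
}
```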

struct LSuperposition

LSuperposition structure holding superposition information.

Public Members

int label: Label of the segment superposing.

int x: First coordinate of the position of the superposition.

int y: Second coordinate of the position of the superposition.

Definition

Retrieving straight or slightly curved lines from document images can be an essential step of document analysis. This method retrieves such lines by iteratively predicting the position of spans of pixels in the columns of the input image. To do so, it uses Kalman filters to integrate the observed measurements and to decide what is and what is not a segment.
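The predict-then-correct loop described above can be illustrated with a deliberately simplified, scalar Kalman filter that tracks a single quantity (for instance the position of a span from one column to the next). The struct `ScalarKalman`, its static state model and its noise values are assumptions chosen for the example; the filter used by the library tracks more than one quantity and is not reproduced here.

```cpp
#include <cassert>

// Scalar Kalman filter with a static state model (hypothetical sketch).
struct ScalarKalman
{
  double x; // current state estimate (e.g. span position)
  double p; // variance of the estimate
  double q; // process noise variance added at each prediction
  double r; // measurement noise variance

  // Prediction: the state is assumed unchanged, only its uncertainty grows.
  double predict()
  {
    p += q;
    return x;
  }

  // Correction: blend the prediction with the observation z.
  void correct(double z)
  {
    assert(p + r > 0.0);
    const double k = p / (p + r); // Kalman gain: weight given to the observation
    x += k * (z - x);
    p *= (1.0 - k);
  }
};
```

In a tracking setting, the predicted value tells where to look for the next observation; an observation close enough to the prediction is integrated with `correct`, while a missing one leaves the filter running on predictions alone.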

Usage

Retrieve lines in document images

    int                  min_len = 20;
    scribo::SegDetParams params  = {.traversal_mode         = scribo::e_segdet_process_traversal_mode::HORIZONTAL,
                                    .discontinuity_relative = 30,
                                    .max_thickness          = 5};
    auto [labelled_image, superposition_vector, vsegment_vector] = scribo::detect_line_full(input, min_len, params);

Build vectorial output image

    mln::image2d<argb8> out_vector(input.width(), input.height(),
                                   mln::image_build_params{.border = 0, .init_value = {}});
    mln::transform(input, out_vector, [](int x) { return x != 0 ? argb8{255, 255, 255, 255} : argb8{0, 0, 0, 255}; });
    render_image_vector(out_vector, vsegment_vector, output_filepath_vector);

Build pixel output image

    // Unlabelled pixels (label 0) become 1 where the input is non-zero and stay 0 elsewhere.
    mln_foreach (auto pt, labelled_image.domain())
      labelled_image(pt) = labelled_image(pt) == 0 ? (input(pt) != 0 ? 1 : 0) : labelled_image(pt);
    mln::io::imsave(mln::view::transform(labelled_image, [](auto x) { return regions_lut(x); }), output_filepath_pixel);

Display superposition information

    for (auto superposition : superposition_vector)
      fmt::print("label={}; x={}; y={}\n", superposition.label, superposition.x, superposition.y);

(Full code: /snippets/segdet.cpp)


Figures: input image; lines detected by Scribo, vector output (superimposed); lines detected by Scribo, pixel output (superimposed).

References

[Lep95]

Leplumey, Ivan, Jean Camillerapp, and Charles Queguiner. “Kalman filter contributions towards document segmentation.” Proceedings of 3rd International Conference on Document Analysis and Recognition. Vol. 2. IEEE, 1995.