C1TextParser

Introduction

Extract data from plain text or HTML files so that it can be stored in a table of records or transferred to another system. The C1TextParser library for .NET Standard enables you to efficiently integrate data from semi-structured sources, such as emails and invoices, into your workflows.

    Email is a common source of data for some parts of a company, such as sales and marketing, and that data is often extracted manually. Whenever you receive emails that share a similar, repeated structure, a parser can be useful. C1TextParser enables you to extract, store, and track this kind of repeated data from emails.

    Once the data has been extracted, it can be stored (to build a table of relevant records) or passed on to another destination. CSV is another example of structured data, though it is more often machine-generated.

    The C1TextParser library includes three extractors for different scenarios: Starts-After-Continues-Until, Html, and Template-Based. Extraction can be driven by matched regular expressions, by a matched word or phrase, or by a defined script.

    Html Extractor
    Automates the extraction of relevant data from HTML content, such as the flight tickets and e-commerce receipts that frequently arrive in an email client.
    Starts After Continues Until Extractor
    Designed to extract relevant text from a plain-text source, specifically all the text contained between the matches of two regular expressions.
    Template Based Extractor
    A generic tool that parses custom user data structures that follow a specified format.
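
    The "starts after, continues until" idea can be illustrated with a minimal sketch that uses only System.Text.RegularExpressions. It is not the C1TextParser API: the class name StartsAfterContinuesUntilSketch and the method ExtractBetween are hypothetical names chosen for illustration, and the library's own extractor may behave differently.

```csharp
using System;
using System.Text.RegularExpressions;

// Illustrative only: this is NOT the C1TextParser API. It mimics the
// "starts after / continues until" idea with the standard .NET regex classes.
public static class StartsAfterContinuesUntilSketch
{
    // Returns the trimmed text between the first match of startsAfter and the
    // next match of continuesUntil, or null when either pattern is not found.
    public static string ExtractBetween(string text, string startsAfter, string continuesUntil)
    {
        Match start = Regex.Match(text, startsAfter);
        if (!start.Success)
        {
            return null;
        }

        // Extraction begins right after the end of the first match.
        int from = start.Index + start.Length;

        // Search for the closing pattern only from that position onward.
        Match end = new Regex(continuesUntil).Match(text, from);
        if (!end.Success)
        {
            return null;
        }

        return text.Substring(from, end.Index - from).Trim();
    }
}
```

    For example, calling ExtractBetween("Order number: 12345\nTotal: $99.00", @"Order number:", @"\r?\n") returns "12345". Returning null when a pattern is missing is a design choice of this sketch, so that callers can distinguish "no match" from an empty extraction.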

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Web;
    using System.Web.Mvc;
    
    namespace SamplesExplorer.Controllers
    {
        public partial class C1TextParserController : Controller
        {
            // Renders the view for the Intro action; the text itself comes from resources in the view.
            public ActionResult Intro()
            {
                return View();
            }
        }
    }
    
    <div>
        <div class="copy">
            <h3>@Html.Raw(Resources.C1TextParser.Intro_Text_Summary)</h3>
    
            <p>@Html.Raw(Resources.C1TextParser.Intro_Text1)</p>
            <p>@Html.Raw(Resources.C1TextParser.Intro_Text2)</p>
    
            <div class="collapsed-content collapse">
                <p>@Html.Raw(Resources.C1TextParser.Intro_Text3)</p>
    
                <dl class="dl">
                    <dt>@Html.Raw(Resources.C1TextParser.Intro_HtmlExtractor)</dt>
                    <dd>@Html.Raw(Resources.C1TextParser.Intro_HtmlExtractor_Text)</dd>
                    
                    <dt>@Html.Raw(Resources.C1TextParser.Intro_StartsAfterExtractor)</dt>
                    <dd>@Html.Raw(Resources.C1TextParser.Intro_StartsAfterExtractor_Text)</dd>
                                    
                    <dt>@Html.Raw(Resources.C1TextParser.Intro_TemplateBasedExtractor)</dt>
                    <dd>@Html.Raw(Resources.C1TextParser.Intro_TemplateBasedExtractor_Text)</dd>
                    
                </dl>
            </div>
            <p>
                <button type="button"
                        data-toggle="collapse"
                        data-target=".collapsed-content, .btn.btn-default.btn-xs.collapse"
                        class="btn btn-default btn-xs collapse in">
                    @Html.Raw(Resources.C1TextParser.Intro_More)
                </button>
            </p>
    
        </div>
        
    </div>
    @section Summary{
    <p>@Html.Raw(Resources.C1TextParser.Intro_Text0)</p>
    
    }