
Moss | An open source python library for master data management through machine learning.
Data can get messy. Moss can help.

MossML

An open source Python library for master data management through machine learning. WARNING: MossML is under development; check back later for more features.

MossML Planned Features

Use Case Scenarios

Record Matching

One table was given to the elves, one to the dwarfs, and one to the kingdom of men. You, being the blight of Middle-earth, want one table to rule them all: a single source of truth across all records. But you've got trouble. The elves stored full addresses, while the dwarfs, preferring something shorter, stored just a county name and a ZIP code. The humans wanted to be creative, so they went with address lines 1, 2, and 3 but entered the data into these three fields with near-zero consistency. What’s worse, some of those uppity humans and hobbits appear in one table as Frodo Baggins and in another as The Ring Bearer.

So you use MossML to train a matching algorithm that detects matching records across tables. The algorithm learns whether two records match by being given examples of matching and non-matching records. But your army of orcs is busy waging war and can’t spend all their time hunting for matching records to train the algorithm. Luckily, MossML uses active learning to request the specific training examples that maximize its rate of learning, so it needs fewer examples to train. Phew… being an evil necromancer has never been easier!
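The active learning idea above can be sketched in a few lines. This is not MossML's API, which this page does not document; it is a minimal illustration using only the Python standard library. It assumes a single string-similarity score per record pair and a simple threshold classifier, and every name in it (`similarity`, `fit_threshold`, `most_uncertain`, `active_learn`, `KNOWN_MATCHES`) is hypothetical. The labeler is always asked about the pair the current model is least sure of, which is the pair scoring closest to the decision threshold.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a stand-in for richer record features."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fit_threshold(labeled):
    """Put the threshold midway between the best-scoring non-match
    and the worst-scoring match labeled so far."""
    matches = [score for score, is_match in labeled if is_match]
    non_matches = [score for score, is_match in labeled if not is_match]
    low = max(non_matches, default=0.0)
    high = min(matches, default=1.0)
    return (low + high) / 2


def most_uncertain(pool, threshold):
    """Uncertainty sampling: the pair whose score is nearest the threshold."""
    return min(pool, key=lambda pair: abs(similarity(*pair) - threshold))


def active_learn(pool, oracle, budget):
    """Request at most `budget` labels, most uncertain pair first."""
    pool = list(pool)
    labeled = []
    threshold = 0.5  # arbitrary starting guess (an assumption)
    for _ in range(min(budget, len(pool))):
        pair = most_uncertain(pool, threshold)
        pool.remove(pair)
        labeled.append((similarity(*pair), oracle(pair)))
        threshold = fit_threshold(labeled)
    return threshold, labeled


# Hypothetical candidate pairs drawn from two tables.
CANDIDATES = [
    ("Frodo Baggins", "Frodo Baggins"),
    ("Frodo Baggins", "F. Baggins"),
    ("Samwise Gamgee", "Sam Gamgee"),
    ("Gimli", "Legolas"),
    ("Aragorn", "Boromir"),
]

# Stand-in for a human (or orc) labeler who knows the true matches.
KNOWN_MATCHES = set(CANDIDATES[:3])

threshold, labeled = active_learn(
    CANDIDATES, oracle=lambda pair: pair in KNOWN_MATCHES, budget=3
)
print(f"learned threshold: {threshold:.2f} from {len(labeled)} labels")
```

A pair such as Frodo Baggins and The Ring Bearer would defeat a string-similarity feature entirely, which is why a real matcher needs richer features than this sketch uses.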

Schema Mapping + Record Matching

You are a dashing part-time venture capital consultant, and you subscribe to multiple pitch deck databases. You suspect there is a lot of overlap between these databases and want to know whether you can unsubscribe from one. The trouble is that the columns in these databases don’t match up, and some of the columns don’t even make sense to you. Furthermore, one DB lists a company as Sunshine LLC and the other DB lists it as Sunshine Systems Inc. You don’t want to match up the records manually, and Excel doesn’t quite cut it. MossML schema mapping and record matching are exactly what you want.

Schema Mapping

You are a new employee at Uncontrolled IT System Sprawl Inc. Your boss wants you to get familiar with all of the databases, including the legacy systems and the systems from Germany. The trouble is that the column names could not be less helpful (Column 1, Column 2, Column 3), and you can’t understand the German column headers at all. But you know that the columns must have equivalents across the different databases. Schema mapping can help you find equivalent columns.
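One simple way to find equivalent columns without relying on their names is to compare the values they hold. The sketch below is an assumption about how such a mapping could work, not a description of MossML's method: it scores each pair of columns by the Jaccard overlap of their value sets and picks the best partner for each column. The function names and the sample tables are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections of values, in [0, 1]."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def map_columns(left, right):
    """For each column in `left`, find the most similar column in `right`.

    Both arguments map column names to lists of values, and `right` must
    contain at least one column. Returns {left_name: (right_name, score)}.
    """
    mapping = {}
    for left_name, left_values in left.items():
        best = max(right, key=lambda name: jaccard(left_values, right[name]))
        mapping[left_name] = (best, jaccard(left_values, right[best]))
    return mapping


# Hypothetical sample data: unhelpful legacy names versus German headers.
legacy = {
    "Column 1": ["Berlin", "Hamburg", "Bremen"],
    "Column 2": ["10115", "20095", "28195"],
}
german = {
    "Postleitzahl": ["10115", "20095", "01067"],
    "Stadt": ["Berlin", "Hamburg", "Dresden"],
}

print(map_columns(legacy, german))
# {'Column 1': ('Stadt', 0.5), 'Column 2': ('Postleitzahl', 0.5)}
```

Value overlap only works when the databases share some records. Columns with disjoint values would need other signals, such as data type or value distribution.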

Approximate Record Search

You make frozen dairy treats. You want to find an old product in your DB, but all you recall is that it was discontinued before 2015 (or maybe it was 2014), that it was sold exclusively in North America (or it might have been Latin America), and that you think it was called “Chocolate Supreme” or something like that. You do an approximate record search and find it was actually called “Wonder Brownie”, at which point you put your hand to your forehead and say, “Silly me! Glad MossML handles semantic similarity.”

Approximate Record Search

You’re entering a new customer, Nate, into your database, but as you’re adding his information, a quick search shows that this customer is already in the system under the name Nathan. Hurray for repeat business!
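The Nate and Nathan check can be approximated with the standard library's `difflib.get_close_matches`. This is only a sketch of the idea, not MossML's implementation: the helper name `find_similar_customers` and the cutoff of 0.5 are assumptions chosen for illustration. It handles spelling-level similarity only, not the semantic similarity mentioned in the previous scenario.

```python
from difflib import get_close_matches


def find_similar_customers(name, existing, cutoff=0.5, limit=3):
    """Return up to `limit` existing names that look like `name`.

    Comparison is case-insensitive; `cutoff` is the minimum
    difflib similarity ratio (0 to 1) for a name to count as a hit.
    """
    # Map lowercase forms back to the names as stored.
    by_lower = {}
    for stored in existing:
        by_lower.setdefault(stored.lower(), stored)
    hits = get_close_matches(name.lower(), list(by_lower), n=limit, cutoff=cutoff)
    return [by_lower[hit] for hit in hits]


# Hypothetical customer list.
customers = ["Nathan", "Gandalf", "Margaret"]
print(find_similar_customers("Nate", customers))  # ['Nathan']
```

Running such a check before inserting a record lets the entry form warn about a likely duplicate instead of creating one.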

Why Moss?

  1. Moss is Fuzzy (Matching)
  2. Grows even in the dirtiest environment
  3. The IT Crowd was a great show