MongoDB World 2017: Building Micro-Services Based ERP System

Jerry M Reghumadh of Capiot gave a talk on building micro-services from the ground up.  The legacy system his group replaced was monolithic and rigid.  The solution the Capiot team proposed to the client placed each component of the ERP system into its own atomic service.  Everything built on the platform was exposed as an API, and these APIs were very "chatty".

The engineering decisions included the choice of NodeJS and MongoDB as the base technologies for the platform.  NodeJS was selected in part because of its small footprint, which lowered the barrier to entry for the application.  Java was considered but was too heavy for the needs of the project.  MongoDB was selected for the data persistence layer because it saves data as documents and does not require marshaling and unmarshaling of data.  MongoDB also allowed the implementation team to use a flexible schema, and it offered greater ease of clustering and sharding than the other options available for this project.  This allowed the developers to implement the system without relying on a dedicated database administrator.

The technology stack included:

  • NodeJS
  • ExpressJS
  • Mongoose.js
  • Passport.js

The team implemented a governance model that required any exposed API to be published in Swagger.  This prevented the proliferation of "rogue" APIs: any API not exposed in Swagger would not work properly in the system.  Mongoose allowed the team to enforce a schema.
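The talk didn't show the schema code itself.  As a rough illustration of the kind of check Mongoose performs before a document is saved, here is a minimal hand-rolled sketch; the field names (sku, qty, note) are hypothetical, and a real project would declare this with mongoose.Schema instead:

```javascript
// Conceptual sketch of Mongoose-style schema enforcement.
// Field names here are hypothetical, not from the talk.
const itemSchema = {
  sku:  { type: 'string', required: true },
  qty:  { type: 'number', required: true },
  note: { type: 'string', required: false },
};

// Validate a candidate document against the schema before saving it,
// returning a list of human-readable validation errors.
function validate(schema, doc) {
  const errors = [];
  for (const [field, rule] of Object.entries(schema)) {
    const value = doc[field];
    if (value === undefined) {
      if (rule.required) errors.push(`${field} is required`);
    } else if (typeof value !== rule.type) {
      errors.push(`${field} must be a ${rule.type}`);
    }
  }
  return errors;
}

console.log(validate(itemSchema, { sku: 'A-100', qty: 3 }));       // []
console.log(validate(itemSchema, { sku: 'A-100', qty: 'three' })); // [ 'qty must be a number' ]
```

With mongoose, the same intent is expressed declaratively and enforced automatically on save, which is what let the team keep a flexible document store without losing schema discipline.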


MongoDB World 2017: Using R for Advanced Analytics with MongoDB

Jane Uyvova gave a talk on analytics using MongoDB and the R statistical programming language.  She began by contrasting analytics with data insight.  R has become a standard for analyzing data due to its open-source nature and easier licensing requirements than legacy tools such as SAS or SPSS.

Use Cases

  • Churn Analysis
  • Fraud Detection
  • Sentiment Analysis
  • Genomics

Use Case 1: Genomics

The human genome consists of billions of base pairs.  The dataset used came from the HapMap project.

  • HapMap3 was the dataset
  • Bioconductor was the R library used for the analysis
  • R-Studio was the development environment
  • The mongolite connector linked R to MongoDB

The MongoDB data aggregation framework was used to aggregate the data by region.
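The exact pipeline wasn't shown in the talk.  A minimal sketch of grouping by a hypothetical region field (the field and collection names are assumptions), along with a plain-JS equivalent of what the $group stage computes, might look like:

```javascript
// Hypothetical aggregation pipeline counting variants per region.
// Field/collection names (region, chromosome, variants) are illustrative.
const pipeline = [
  { $match: { chromosome: '1' } },                        // restrict to one chromosome
  { $group: { _id: '$region', count: { $sum: 1 } } },     // count docs per region
  { $sort: { count: -1 } },                               // largest regions first
];
// Against a live cluster this would run as:
//   db.variants.aggregate(pipeline)

// A plain-JS equivalent of the $group stage, for illustration:
function groupByRegion(docs) {
  const counts = {};
  for (const d of docs) counts[d.region] = (counts[d.region] || 0) + 1;
  return counts;
}

console.log(groupByRegion([
  { region: 'EUR' }, { region: 'EUR' }, { region: 'AFR' },
]));
```

Because the aggregation runs server-side, only the per-region summaries need to cross the wire into R, rather than the raw genomic records.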

In doing genomic analysis, schema design becomes important in making the analysis easier and more effective.

Use Case 2:  Vehicle Situational Awareness

  • Chicago open data was used as the dataset
  • The dataset was loaded into MongoDB and Compass was used for the initial analysis
  • R was used to analyze the data and to extract data for a density plot (ggplot2)
  • The MongoDB flexible schema allows a wide variety of data to be included in the analysis

One issue that must be addressed is scalability.  Since R is a single-threaded application, data scientists run up against data-volume constraints.  One solution is to use Spark to parallelize and scale R.

A MongoDB/Spark architecture can include an operational component, consisting of the application and a MongoDB driver, alongside a data management component consisting of the MongoDB cluster.

MongoDB World 2017: Migrating from MongoDB on EC2 to Atlas

Atlas was introduced by MongoDB as its SaaS offering for MongoDB.  Atlas allows administrators, developers, and managers to deploy a complete MongoDB cluster in a matter of minutes.  Some basic requirements for using Atlas include:

  • Atlas requires SSL
  • Set up AWS VPC peering
  • VPN and Security Setup
  • Use Amazon DNS (when on AWS)

The preparation work that must be done includes:

  • Pick a network CIDR that won't collide with your existing networks
  • Use a MongoDB 3.x engine with the WiredTiger storage engine
  • Test on replicas in testing/staging environments
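The CIDR-collision check above is easy to get wrong by eye.  Here is a small illustrative helper (not from the talk) that converts IPv4 CIDR blocks to numeric ranges and tests them for overlap before you commit to a peering configuration:

```javascript
// Convert an IPv4 CIDR block like "10.0.0.0/16" to a numeric
// [start, end] address range, aligned to the network boundary.
function cidrToRange(cidr) {
  const [ip, bits] = cidr.split('/');
  const base = ip.split('.').reduce((n, octet) => n * 256 + Number(octet), 0);
  const size = 2 ** (32 - Number(bits));
  const start = Math.floor(base / size) * size; // align to network boundary
  return [start, start + size - 1];
}

// Two CIDR blocks collide if their address ranges intersect.
function cidrsOverlap(a, b) {
  const [a0, a1] = cidrToRange(a);
  const [b0, b1] = cidrToRange(b);
  return a0 <= b1 && b0 <= a1;
}

console.log(cidrsOverlap('10.0.0.0/16', '10.0.128.0/17'));  // true: collision
console.log(cidrsOverlap('10.0.0.0/16', '192.168.0.0/16')); // false: safe to peer
```

In practice you would run a check like this against every VPC and on-premises range that will be reachable over the peering connection.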

Atlas supports the live migration of data from an EC2 instance.  The Mongo migration tool (mongomirror) can be used to migrate the data.

MongoDB World 2017: Video Games

Jane McGonigal's keynote started by highlighting that 2.1 billion people around the world play video games, spending 12 billion hours a week on gameplay.  There is something energizing about gameplay; brain scans suggest that the opposite of play may be depression.

Gameplay seems to increase activity in the hippocampus portion of the brain.  One interesting fact is that gamers fail at gameplay 80% of the time; in spite of this, failure in games tends to activate gamers' ability to learn and keep trying.

Jane does game research and has analyzed the psychology of gameplay.  Pokémon Go was the fastest-downloaded app in history, with 500 million downloads in the first 30 days.  Why was this game so popular?  Pokémon Go elicits a sense of opportunity and engages its players.  When 650 million people walk around playing a mobile game, a lot of data is generated, and analyses have been done on the game's usage statistics.

Augmented reality may be a more compelling experience and platform for gaming than virtual reality.  The lessons learned from gaming data are:

  1. People want to engage the real world in an interesting way
  2. Games will be a huge driver of data collection

The app Priori listens to a person's voice to determine their mental state.  Emotient is a technology that determines a person's mood from their facial expression.  Emotiv is a sensing device that can detect emotions.  Mooditood is a social network for sharing how people feel.

MongoDB 2017 Keynote

The keynote was given by Megan and Richard in the Hyatt Grand Ballroom.  Megan stated that there would be six tutorials.  MongoDB World is paperless this year, so a mobile app will be used to find sessions and tutorials and to give feedback.

Tom Schenk (CTO, City of Chicago) was the first keynote speaker.  WindyGrid is a system built for the city of Chicago that allows city staff to analyze and visualize data piped in from 17 different input applications.  MongoDB drives this application.  Data sent to the system includes weather, 911 calls, 311 calls, etc.  Predictive analytics are used to drive the operations of the city, and the data is captured and managed in MongoDB.  The code for the food-safety predictive analytics has been released as open source.

Tom transitioned by introducing the CEO of MongoDB, Dev Ittycheria.  Dev began by reviewing the history of the internet and technology since the introduction of Netscape in 1995.  Around 2000, fiber-optic communications were built out, making it cost-effective for technology work to be distributed around the world.  2007 was a watershed year because the cost of computing power and storage dropped to a point where many new types of business concepts became feasible.  MongoDB was founded in 2007.

The MongoDB feature timeline:

  • 2007 – document model
  • 2010 – distributed framework
  • 2012 – aggregation framework
  • 2013 – management and security
  • 2014 – WiredTiger
  • 2015 – document validation
  • 2016 – MongoDB Atlas

Dev talked about the growth of MongoDB over the last few years and how widespread the platform has become.  Last year there was a 30% increase in people taking MongoDB University classes.

Shawn Melamed (Morgan Stanley) stated that Morgan Stanley started using MongoDB 5 years ago.  One of the attractions of the database is the flexibility of how data can be structured.

Eliot Horowitz then announced the new products and services coming out of MongoDB in 2017:

  • Business Intelligence – (Note: a new connector release allows the Tableau visualization tool to connect easily to MongoDB)
  • MongoDB Charts was announced, which allows charts to be built in a document-centric view.  This allows developers and analysts to quickly build dashboards based on MongoDB.  The tool has an intuitive drag-and-drop interface and lets users build common chart types (bar charts, pie charts).  This feature will be part of MongoDB 3.6 and will be released later this year.
  • In MongoDB 3.6, the $lookup operator has been enhanced to allow more advanced and sophisticated queries, and enhancements have been added to updating arrays
  • Fully expressive array updates
  • Schema (schemaless, semi-structured, unstructured) – MongoDB 3.6 will support JSON Schema validation.
  • Retryable writes – this guarantees that a write will happen at least once, and exactly once if desired.
  • In 3.6, MongoDB will bind to localhost only by default.  This should close some current security holes in the product.
  • Change streams in action
  • MongoDB Stitch (NEW Service) – Backend as a service
    • REST API for MongoDB
    • Configuration based authentication, privacy, and security
    • Service composition
    • Anywhere you want
    • Available on MongoDB Atlas today in beta
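Two of the 3.6 features above can be sketched concretely.  The collection and field names below (students, grades, score) are hypothetical; the shapes are shown as plain objects, as they would be passed to a driver:

```javascript
// Fully expressive array updates: modify only the array elements
// matching a filter, via the arrayFilters option (new in 3.6).
const arrayUpdate = {
  filter: {},
  update: { $set: { 'grades.$[g].passed': false } },
  options: { arrayFilters: [{ 'g.score': { $lt: 60 } }] },
};
// db.students.updateMany(arrayUpdate.filter, arrayUpdate.update, arrayUpdate.options)

// JSON Schema validation: enforce structure on an otherwise
// flexible collection using a $jsonSchema validator.
const validator = {
  $jsonSchema: {
    bsonType: 'object',
    required: ['name', 'score'],
    properties: {
      name:  { bsonType: 'string' },
      score: { bsonType: 'int', minimum: 0 },
    },
  },
};
// db.createCollection('students', { validator })
```

The validator is opt-in per collection, so teams can keep schemaless collections alongside validated ones in the same database.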

MongoDB 3.6 is scheduled to ship in November 2017.

  • MongoDB Atlas is the recommended way of running MongoDB in the cloud.
    • Free Tier
    • Performance Tier
    • Queryable backups
    • Supported on AWS
    • Support added for Microsoft Azure and Google Cloud
    • Hosted BI Connector
    • LDAP authentication allows connection to on-premises enterprise systems
    • Cross-regions support (coming next year)
    • Cross-cloud support (coming next year)

Cailin Nelson, Vice President, Cloud Engineering spoke and announced support for Microsoft Azure and Google Cloud in addition to AWS.  Atlas has been optimized for each cloud platform.  The new performance advisor allows administrators and developers to track down performance issues easily.

What has changed since 2007

  • The Web as a 1st class citizen
  • Mobile
  • IoT

Polymorphic data can appear in the same collection within MongoDB, which is pretty powerful.

The hashtag #MDBW17 will be used to post announcements.

MongoDB World 2017 Chicago Opening

I'm in the Chicago Hyatt Ballroom waiting for the MongoDB World 2017 keynote to start.  It appears that this will be a good conference with a lot of informative breakout sessions.  It will be interesting to see what product announcements and programs are introduced during the show.  There appears to be a sizable number of attendees at this year's conference.  I will post any significant announcements from the show as I learn of them.

New posts will appear on Twitter and more detailed commentary will be made on this blog.