Machine learning follows a simple flow, much like a human brain: it continuously enriches what the machine knows and applies that knowledge to make better decisions. It is a continuous refinement process that gets better with every learning iteration. The more data we feed into these iterations, the better the inferences the machine can make.
In this post we will walk you through the fundamentals of how a typical machine learning framework works.
There are primarily three stages in a machine learning project. First comes the unglamorous part, where you bring in and prepare your data. Second, you experiment: you train and evaluate your machine learning model (we will learn about models in upcoming posts). Last but not least, you deploy your trained model to production to solve the problem you started with. The data coming out of the production deployment is fed back to the data wranglers to provide feedback and improve the model.
Let’s dig deeper into each of these stages using the same example we started with in our first blog post: how much is my house worth?
Stage 1: Data Wrangling
There are three parts to data wrangling: collecting the data, cleaning it, and transforming it into a form your algorithms can use.
Collecting data from myriad systems is not the most exciting part of a machine learning project. Your data typically lives in your data centers, or out in the world with your cloud-hosted applications or partners. This is where you set up connections with the systems that contain the source data you want to run your algorithms on. The more you automate the collection of this data, the better, since doing it manually again and again is tedious and inefficient.
For our house prices example, we will go out and collect house price data along different attributes like square footage, year of construction, location, school ranking, proximity to amenities, etc. This can be collected from recently sold or currently listed homes from organizations like MLS. Tactically, this means downloading or scraping files containing data in Excel or comma-separated value (CSV) formats. Some organizations provide APIs (Application Programming Interfaces), which are, in effect, windows into their data.
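To make the file-download path concrete, here is a minimal sketch of reading such a CSV export with Python's standard library. The column names and values are assumptions, not a real MLS format:

```python
import csv
import io

# A toy CSV in the shape of a listings export (column names are assumptions).
raw = """address,sqft,year_built,price
12 Oak St,1500,1995,1.2M
9 Elm Ave,2000,2001,1.6M
"""

# csv.DictReader gives one dict per row, keyed by the header line.
rows = list(csv.DictReader(io.StringIO(raw)))
print(rows[0]["sqft"])  # 1500
```

Note that everything comes out as strings ("1500", "1.2M"): raw data is rarely in the shape your algorithms need, which is exactly why the cleaning and transformation steps below exist.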
What you get from your source data systems is not always ready to feed into the rest of your pipeline. This is the stage where data scientists figure out how to translate a raw, dirty, unusable data stream into something a machine can understand. They set up rules that remove unwanted pieces of data and reformat pieces that do not make sense in their current form.
The house price data we collect was mostly meant for consumption by human beings, or for purposes other than analysis. The files we collect often contain rows and columns of information that are irrelevant, or that are simply there because the format was not designed for consumption by a program. This stage is about scrubbing all of those unwanted pieces from the files we will consume in later stages.
In this stage you massage the data elements into something close to what a machine learning algorithm can understand. Machine learning models are only as good as the data used to train them, and a key characteristic of good training data is that it is provided in a way that is optimized for learning and generalization.
In the final stage of data wrangling for house prices, you convert values such as prices quoted in millions into dollars, addresses into geolocation coordinates, and bathroom counts like “2 and a half” into a numeric 2.5 or 3.5. After these transformations, the data is readable by our algorithms.
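The transformations above can be sketched as small parsing functions. The input formats here ("1.2M", "2 and a half") are assumed examples; a real pipeline would use a library like pandas and handle many more cases:

```python
def parse_price(raw):
    """Convert a human-readable price like '1.2M' into whole dollars."""
    raw = raw.strip().upper()
    if raw.endswith("M"):
        return int(float(raw[:-1]) * 1_000_000)
    return int(float(raw))

def parse_bathrooms(raw):
    """Convert a count like '2 and a half' into a numeric 2.5."""
    raw = raw.strip().lower()
    if "and a half" in raw:
        return float(raw.split(" ")[0]) + 0.5
    return float(raw)

print(parse_price("1.2M"))              # 1200000
print(parse_bathrooms("2 and a half"))  # 2.5
```

The point is not these particular rules but the pattern: every human-oriented value becomes a number the learning algorithm can consume.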
The data set is also divided into two parts at this stage: one part used for training, and the other used to verify whether the machine learning algorithm actually learned what it was supposed to.
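That split can be done in a few lines. This is a minimal hand-rolled sketch (libraries like scikit-learn provide a ready-made `train_test_split`); the home records are made-up examples:

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows and hold out a fraction for later evaluation."""
    rows = rows[:]                      # copy so the caller's list is untouched
    random.Random(seed).shuffle(rows)   # fixed seed keeps the split reproducible
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

homes = [{"sqft": 1000 + 100 * i, "price": 200_000 + 15_000 * i} for i in range(10)]
train, test = train_test_split(homes)
print(len(train), len(test))  # 8 2
```

Shuffling before splitting matters: if the file happens to be sorted by price or date, a naive head/tail split would train and test on systematically different homes.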
Stage 2: Experimentation
After the data is ready in a digestible format comes the heart of a machine learning project: training and evaluating models. What is a model, by the way?
A model is the output of the training stage; it captures the patterns that are used to answer your question.
In this stage you feed data to the training algorithm, and out comes a model that captures the patterns in this data to answer your question. You also specify the training parameters that control this process, and you take multiple passes over the data set, refining the model and finding progressively better patterns.
Once our house price data is massaged, it is passed to our learning algorithms. Here you also specify training signals: for example, the number of passes to take through the data set, the maximum size of the model, and ways to remove bias from extreme data points (that multi-million-dollar mansion sold recently in the neighborhood that is nothing like the typical home around it).
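As an illustration of what "training parameters" and "passes over the data" mean, here is a deliberately tiny sketch: fitting price from square footage with gradient descent, where `epochs` is the number of passes. The data points and scaling constants are assumptions for the example, not a real training recipe:

```python
def train(data, epochs=2000, learning_rate=0.1):
    """Fit price ~ w * sqft + b by gradient descent.
    `epochs` is the number of passes over the data set."""
    # Rescale features so gradient descent behaves; a stand-in for the
    # normalization knobs a real framework exposes.
    xs = [sqft / 1000.0 for sqft, _ in data]
    ys = [price / 100_000.0 for _, price in data]
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):                 # one epoch = one full pass
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def predict(w, b, sqft):
    """Undo the scaling to get a dollar estimate back out."""
    return (w * (sqft / 1000.0) + b) * 100_000

data = [(1000, 150_000), (1500, 200_000), (2000, 250_000)]
w, b = train(data)
print(predict(w, b, 1750))  # ≈ 225000
```

More passes and a well-chosen learning rate give a progressively better fit, which is exactly the refinement loop described above; real frameworks expose many more such knobs.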
After your model comes out of the training stage, you need to make sure it is actually doing its job correctly. This stage determines the success or failure of the model. You need a new set of data, one that was not used for training, to verify that the model is actually doing the right thing.
This is where we use the second part of the house price data that was collected and set aside. The model output by the training stage is now tested against this second data set to make sure our machine did its job.
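One common way to score the model on that held-out data is mean absolute error: the average dollar gap between prediction and reality. The model and held-out prices below are hypothetical stand-ins:

```python
def mean_absolute_error(model, holdout):
    """Average dollar error of the model on data it never saw in training."""
    errors = [abs(model(sqft) - price) for sqft, price in holdout]
    return sum(errors) / len(errors)

# Hypothetical trained model: price = 100 * sqft + 50,000.
model = lambda sqft: 100 * sqft + 50_000

# Held-out (sqft, actual sale price) pairs, kept out of training.
holdout = [(1200, 172_000), (1800, 231_000)]
print(mean_absolute_error(model, holdout))  # 1500.0
```

If this error is acceptable for the question at hand, the model passes; if not, you go back to the data or the training parameters.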
Stage 3: Deployment
Now that you have a trained and verified model, you can put it where it starts answering your questions. For example, if we trained a model to decide whether a given email is spam, in this stage you would ship that model with your email clients to start separating spam from the good emails.
Training and evaluation of models happen in a vacuum, where real-time data never touches the machine learning algorithms. Integrating with the actual system where the output will be used is where you start operating at scale, with responses required within a certain time period, to actually answer the question you started with.
In the case of our house prices example, integrating might mean putting this model to work inside a website that predicts the value of a home given its address.
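At its simplest, "deploying" means serializing the trained model and loading it inside the serving application. This sketch uses Python's `pickle` and a hypothetical two-parameter model; a real site would add feature lookup by address, versioning and monitoring:

```python
import pickle

# Hypothetical trained model: just the (w, b) pair for price = w * sqft + b.
trained_model = {"w": 100.0, "b": 50_000.0}

# "Shipping" the model: serialize it so the website process can load it.
blob = pickle.dumps(trained_model)

def estimate_home_value(model_blob, sqft):
    """What the website would run for each visitor's request."""
    model = pickle.loads(model_blob)
    return model["w"] * sqft + model["b"]

print(estimate_home_value(blob, 1500))  # 200000.0
```

The key property is that training code and serving code are decoupled: the website only needs the serialized model, not the training pipeline.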
Monitor & refresh
Machine learning is an iterative process; it does not end with one cycle. You need to stay on top of how your algorithms are performing, debug and fix problems, adjust the training data as needed, and keep feeding new scenarios from your production environment back to the data wrangling team. Refreshing the data is the secret sauce that makes machine learning algorithms get better over time, just as human beings do with the wisdom of having been in different situations many times.
Machine learning is all about getting better over time. By monitoring and annotating the accuracy and biases of the house price predictions, data scientists feed this information back into data preparation and parameter tuning, making the predictions better iteration after iteration.
There is a pattern to how machine learning projects are executed, and it is analogous to how the human brain works: collecting, cleaning, analyzing and learning from the data around us. Most of a machine learning project is the hard, tedious, unglamorous work of data preparation, cleaning and transformation. We hope this blog post gave you a sense of the pipeline a typical machine learning project goes through.
In future posts, we will take you through actual implementations of these steps, with examples using frameworks like TensorFlow, MXNet, Caffe and Torch.