June 2021

Upcoming Events


Introducing Datawave - Scalable Data Ingest and Query

Presented by Hannah Pellón
July 24 | 12:00 | Online

Big data storage can be challenging: complex data models, scalability issues, and the need to work with both structured and unstructured data all get in the way. Datawave addresses many of these issues with a flexible, scalable, and robust architecture that utilizes proven technologies such as Accumulo. Join us in July to learn what Datawave is and how it can help solve your big data needs.

Stay informed
We are excited to bring together data enthusiasts across Maryland and to provide a platform for collaboration, exploration, and learning. Register to attend.

Join the ranks of DAX speakers
Our speaker slots are starting to fill up, but we are still looking for presentations that cover any of the following topics:
  • Interesting use cases from industry and government utilizing machine learning, artificial intelligence, and data science
  • Analytic successes demonstrating technologies and innovation
  • Tools and techniques used for data pipelines, egress, and storage
  • Data exploration with visualization, data journalism, and interactive reporting
Get in touch if you would like to submit a talk. 

Sponsor DAX
There are a lot of opportunities for you or your organization to get involved as a DAX sponsor: provide swag, host a Covid-safe viewing party, or be a named sponsor. Send us an email to get started.

Past Events

Did you miss last week's fantastic presentation by John Hebeler on graph analytics? You can watch or re-watch it on-demand via YouTube. While you're there, check out the other presentations on our YouTube channel.
Don't forget to subscribe so you don't miss a thing! 

Data News and Articles


UMD, UMBC, Army Research Lab Announce $68M Cooperative Agreement to Accelerate AI, Autonomy — A pact aims to develop tech to reduce humans’ workload and risks on the battlefield and aid in Search-and-Rescue (SAR) missions. Tags: MD, Army, AI, SAR

What is DataOps? — Organizations that are able to tame, manage, and unlock their data assets stand to benefit in myriad ways, including improvements to decision-making and operational efficiency, better fraud prediction and prevention, better risk management and control, and more. In addition, data products and services can often lead to new or additional revenue. Tags: DataOps, Management, Efficiency

DeepMind Scientists: Reinforcement Learning is Enough for General AI — This article is part of our reviews of AI research papers, a series of posts that explore the latest findings in artificial intelligence. Tags: AI, Reward

Microsoft Uses GPT-3 to Let You Code in Natural Language — Microsoft is using OpenAI’s massive GPT-3 natural language model in its no-code/low-code Power Apps service to translate spoken text into code in its recently announced Power Fx language. Tags: AI, NaturalLanguage, PowerFX, GPT-3

Analytics is a Mess — Data isn’t objective, and analysis isn’t structured. It’s just as creative—and just as messy—as the papers I ran away from in college. Tags: Data, Analytics, Mode

Automated Data Wrangling — A growing array of techniques apply machine learning directly to the problems of data wrangling. They often start out as open research projects but then become proprietary. How can we build automated data wrangling systems for open data? Tags: Data, DataCleaning

Developing MLB's Automated Ball/Strike System (ABS) — Early in 2019, Major League Baseball announced a partnership with the Atlantic League of Professional Baseball (ALPB) to test new playing rules in order to observe the effects of potential future rule changes and equipment. One of those initiatives was the creation and testing of an automated ball and strike calling system (ABS). Tags: Baseball, AI

IBM’s Project CodeNet Will Test How Far You Can Push AI to Write Software — IBM’s AI research division has released a 14-million-sample dataset to develop machine learning models that can help in programming tasks. Called Project CodeNet, the dataset takes its name after ImageNet, the famous repository of labeled photos that triggered a revolution in computer vision and deep learning. Tags: AI, IBM, Data

Nasty, brutish and short: The life of the modern CDO — The role of CDO is still poorly defined, and CDOs are frequently not set up for success within their organizations. The average CDO tenure is just two and a half years; clearly, CDOs are not being given much time to demonstrate their value before they are asked to move on, or (just as likely) find that the grass looks a little greener somewhere else. Against this backdrop, should we conclude that the CDO role is a poisoned chalice? Tags: CDO, Management, Data

Building Effective Data Science Teams — Whether you are the first “data person” at your organization or leading a team of hundreds, we know success is not based on just technology; it requires people to create a productive, effective, and collaborative data science team. Tags: Management, DataScience, Teamwork

Data Tools and Resources


Is Firebolt the Future of Data Platforms? — A relatively unknown player aiming to dethrone Snowflake, Redshift, and BigQuery. Tags: Data, Firebolt, C++, Kafka

google/zx — Bash is great, but when it comes to writing scripts, people usually choose a more convenient programming language. JavaScript is a perfect choice, but the standard Node.js library requires additional hassle before use. The zx package provides useful wrappers around child_process, escapes arguments, and gives sensible defaults. Tags: Bash, Scripting, Javascript

OriginLab Releases New Data Analysis and Graphing Software, Origin 2021b — OriginLab, a leading publisher of data analysis and graphing software, today announced the release of Origin® and OriginPro® 2021b. This latest version of OriginLab's award-winning software application adds over 100 new features, Apps and improvements, further enhancing Origin's ease-of-use, graphing, analysis and programming capabilities. Tags: DataAnalysis, Origin, Graphing, Visualizations

Introducing Dataflow, a Self-Hosted Observable Notebook Editor — Dataflow is a new tool that lets you run, edit, and compile Observable notebooks locally on your own computer, with any text editor you want! Tags: Dataflow, Editor, Thick-client

jina-ai/jina — Jina allows you to build deep learning-powered search-as-a-service in just minutes. Tags: ML, DeepLearning, Index, Architecture

How To's and Tutorials


Best Practices Around Production Ready Web Apps with Docker Compose — Here's a few patterns I've picked up based on using Docker since 2014. I've extracted these from doing a bunch of freelance work. Tags: Docker, WebApps, Tutorial

Understanding the Data (Error) Generating Processes for Data Validation — Statistics literature often makes reference to the data generating process (DGP): an idealized description of a real-world system responsible for producing observed data. This leads to a modeling approach focused on describing that system as opposed to blindly fitting observations to a common functional form. Tags: DGP, Error, Validation

Time Series Forecasting (Part 2 of 3): Selecting Algorithms — This is the second article of a series focusing on time series forecasting methods and applications. Tags: MSCloud, AI


Share Your Project
Have you been working on a data project and are ready to share your methods, processes, or results? Contact us to get started.

Be a Do-Gooder
Are you looking for a way to get involved in the community and make an impact? Check out the volunteer opportunities with U.S. Digital Response.  

Book Review Opportunity
Are you interested in reviewing an O'Reilly book for the publisher and sharing your views with the world? As if that isn't enough, you get to take a book home to enjoy as well. Send us an email and we'll get you started.

Data Analysis Volunteer Work to Support Baltimore City
Are you an expert with data and willing to mentor, or are you an up-and-coming hobbyist looking for a side project to work on? We have put together a group to focus on a few problems working with Baltimore City data and need your help. The current project focuses on data parsing and analysis for the Baltimore Board of Estimates. If interested, please send us an email or join us on Slack to discuss building a side project group.

Considering a Career Change?
Are you a software or system engineer, data scientist, analytic developer, or cybersecurity expert interested in learning about new opportunities?
Please send us an email to learn about the opportunities available with our partners.

Are You Hiring?
If your company is looking for data scientists, data engineers, software engineers, or other data-related experts, please reach out so that we can help our members find new opportunities.
Please send us an email introducing your company and needs.

Get Involved with Data Works!
Want to be more involved in our data science community? If you have experience running workshops or hackathons or curating newsletters, or are just interested in helping to grow the meetup, please send us an email!

Erias Ventures
Erias has an immediate need for Software Engineers, System Engineers, Test Engineers, Data Scientists, and System Administrators. External referral bonuses are available. For more information, please contact us at


Our sponsors help us bring data analysis Meetups, conferences, and newsletters to Maryland data enthusiasts. If you're interested in joining this prestigious group, send us an email.
If you are interested in speaking, hosting, or sponsoring a meetup, have opportunities to list, or local news to share, please email

Data Works MD · 101 W Dickman St · Baltimore, MD 21784-9239 · USA