Scott Miller
Software Engineer
GitHub: scott-a-miller

I've been writing software professionally for over 15 years. For most of that time I've been working at a chem-informatics software company in Columbus (Leadscope), while consulting with a design group in New York (Tonic Group). I have been lead architect on dozens of projects, including scientific desktop applications, online commerce sites, B-to-B web services, and third-party software integrations. As a principal engineer working at a small company, I'm involved with all aspects of software development, including feature definition, UI design, architecture, and technology evaluation.

Some things I've been using most frequently:

Languages: Java, Ruby, Kotlin, JavaScript, HTML/CSS, PHP
Frameworks/Libraries: Jersey, Dropwizard, Swing, React, MobX, Jdbi, JUnit, XPP, iText, Nanoc, jQuery
Tools: Git, IntelliJ, Emacs, Ant, Maven, Unix tools
Databases: PostgreSQL, Derby, MySQL, Oracle

Other things I've used in production, but infrequently, or not recently:

Languages: C, Flex, C++, NSIS
Frameworks/Libraries: Play!, JAX-WS, KNIME, JPA, Hibernate, Jackson, Ruby on Rails, Google API, JUNG, Protocol Buffers, StringTemplate, Lucene, Elasticsearch, WordPress, CakePHP

Formal education:

1996 Bowling Green State University: B.S. Computer Science
1999 University of California, Davis: M.S. Computer Science - with thesis (Specification of Network Access Policy and Verification of Compliance Through Passive Monitoring)

Work history:

Consulting Software Engineer: Tonic Group, Inc. — 2007 - Present
Senior Software Engineer: Leadscope, Inc. (Instem) — 2001 - Present
Developer: Single Source IT — 2000
Software Engineer: Marimba, Inc. — 1999
Research Assistant: Computer Security Research Laboratory, UC Davis — 1997 - 1999
Teaching Assistant: Computer Science Department, UC Davis — 1996 - 1997
IS Summer Intern: Owens Corning — 1995
Mainframe Operator, Lab Consultant, Technical Writer: Computer Services, BGSU — 1991 - 1996


Publications:

Alexander Amberg, Lisa Beilke, Joel Bercu, Dave Bower, Alessandro Brigo, Kevin P. Cross, Laura Custer, Krista Dobo, Eric Dowdy, Kevin A. Ford, Susanne Glowienke, Jacky Van Gompel, James Harvey, Catrin Hasselgren, Masamitsu Honma, Robert Jolly, Raymond Kemper, Michelle Kenyon, Naomi Kruhlak, Penny Leavitt, Scott Miller, Wolfgang Muster, John Nicolette, Andreja Plaper, Mark Powley, Donald P. Quigley, M. Vijayaraj Reddy, Hans-Peter Spirkl, Lidiya Stavitskaya, Andrew Teasdale, Sandy Weiner, Dennie S. Welch, Angela White, Joerg Wichard, and Glenn J. Myatt (2016), Principles and procedures for implementation of ICH M7 recommended (Q)SAR analyses. Regulatory Toxicology and Pharmacology 77: 13–24. doi:10.1016/j.yrtph.2016.02.004
Kohonen, P., Benfenati, E., Bower, D., Ceder, R., Crump, M., Cross, K., Grafström, R. C., Healy, L., Helma, C., Jeliazkova, N., Jeliazkov, V., Maggioni, S., Miller, S., Myatt, G., Rautenberg, M., Stacey, G., Willighagen, E., Wiseman, J. and Hardy, B. (2013), The ToxBank Data Warehouse: Supporting the Replacement of In Vivo Repeated Dose Systemic Toxicity Testing. Mol. Inf., 32: 47–63. doi:10.1002/minf.201200114


Awards:

Computer Science Achievement Award — 1996 (BGSU)
Esther Hayhurst Award (Achievement in Social Studies) — 1991 (BGHS)

A sample of projects:

Leadscope Compute Service (Leadscope)
A performant and scalable chem-informatics computation service

I designed and implemented a new server architecture augmenting our Leadscope products. This new service performs our most computationally intensive tasks in parallel, allowing the work to be divided among any number of servers. Using Amazon Web Services, we are able to spin up these configuration-less servers on demand.

Leadscope Enterprise (Leadscope)
A chem-informatics platform

Leadscope's core product is a flexible chem-informatics decision support platform with data warehousing capabilities. My main contribution on this product has been the implementation of the user interface. All of the graphing components were built from base Swing components to handle large datasets; e.g. scatterplots, histograms, box plots, heat maps, RPSA trees.

More recently, I've also been maintaining many of the other parts; e.g. DAOs, serialization, full text searching, web services.

Web Services and React Client (Leadscope)
RESTful web services for the Leadscope server and React-based web client

I extended the web services of the Leadscope Enterprise Server with RESTful resources, providing access for a new web client I developed for toxicity database searching and statistical model application.

KNIME Web Services Integration (Leadscope)
Custom KNIME nodes integrating Leadscope web services

KNIME is an open-source data analytics workflow platform. It allows analysts to configure chains of data sources, transformations, and visualizations in an intuitive graphical interface. A number of chem-informatics scientists have adopted KNIME as a way of creating transparent and reproducible analytic processes.

Our Leadscope Enterprise product provides web services for various chem-informatics tasks; e.g. text and chemistry-based searching, toxicity prediction. I created custom KNIME nodes allowing customers to easily integrate these services into their data workflows.

Reporting Frameworks in JRuby and Kotlin (Leadscope)
Dynamic reports in JRuby and Kotlin for Java applications

Leadscope's products generate a plethora of reports, most of which are highly dynamic. For the first few years we worked with Sitraka JClass, and later JasperReports. Both required a lot of manual coding for our complex reports. To expedite this process, I created a reporting framework in JRuby utilizing the iText library.

It includes an internal DSL for specifying the templates, cascading styles, and hooks for testing, and it supports both RTF and PDF output.

Later, as the number of reports continued to grow, maintenance and code reuse became more time consuming, partly due to the dynamic typing of Ruby. I then ported the framework to Kotlin, a statically typed language which amazingly also supports the DSL features that we were using in Ruby.

Insilicofirst (Leadscope)
A broker site for online chem-informatics

Insilicofirst was a collaborative effort among several providers of chemistry-based predictive toxicity software. The main product of that effort was a web portal somewhat analogous to Expedia. The user provides a chemical structure of interest (by connection table, name, or id). The portal submits the structure to participating vendors whose web services search for exact and similar matches in toxicity databases, and perform in silico predictions. The portal collates and summarizes the available information. The user can then purchase the detailed results via credit card for download.

I was responsible for the overall architecture, implemented the portal portion, and led the discussions defining the vendor web services. Each vendor was then responsible for implementing and hosting their own web services.

A similar platform was later created as part of an SBIR grant for NIEHS. In that context, the portal and vendor services were installed at the customer site, and the retail portion was dropped.

ToxML Community Site (Leadscope)
A community site for managing the lifecycle of ToxML schema changes

ToxML is an open data exchange standard for a variety of toxicity data and structure-related information. Leadscope led the initial consortium defining the standard, funded in part by a grant from NIST. We adopted ToxML as the fundamental storage mechanism for toxicity data in our products. More recently, we've extended the reach of ToxML through the creation of The ToxML Standards Organisation. Contributing to that effort, we created a community website where people can review the current schema and directly make recommendations for changes and additions. The site lets anyone view all recommended changes, and lets curators manage releases; i.e. accepting changes, tagging releases, and exporting the related artifacts (an XSD schema and a reference parser library).

I designed and implemented the UI, led requirements discussions, and implemented the export functions for the schema and reference parser library.

ToxML Repository (Leadscope)
A client/server application for storing and versioning toxicity data

A part of supporting ToxML is a tool for visualizing, editing, and creating ToxML documents. This was surprisingly challenging with many hundreds of unique fields across the dozen or so study types and structure-related information. A prior implementation based on JGoodies proved too complex to maintain, especially at that early, fluid point in ToxML's definition. When I took over the project, I shifted the approach to first model ToxML with abstractions for collections, primitive types, and composite objects, implement editor components by type, and move all of the form and vocabulary definitions to XML documents. This allowed ToxML to continue to change and expand without complicating the base code.
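
The type-driven approach can be pictured with a small Java sketch. It is only an illustration under my own assumptions, not the code from the tool: every class, method, and field name below is hypothetical, the "editors" are reduced to text descriptions, and the XML form and vocabulary definitions are left out entirely.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names come from the actual ToxML tool. */
final class TypedModelSketch {

    /** A node in the document model. */
    interface Node {}

    /** A leaf value such as a string or a number. */
    static final class Primitive implements Node {
        final String value;
        Primitive(String value) { this.value = value; }
    }

    /** A named group of child nodes (for example, one study record). */
    static final class Composite implements Node {
        final Map<String, Node> fields = new LinkedHashMap<>();
        Composite with(String name, Node child) { fields.put(name, child); return this; }
    }

    /** A repeating list of nodes. */
    static final class Repeating implements Node {
        final List<Node> items = new ArrayList<>();
        Repeating add(Node item) { items.add(item); return this; }
    }

    /**
     * Chooses an "editor" from the node's type alone, never from a field name,
     * and returns a plain-text description of the resulting editor tree.
     */
    static String describeEditor(Node node, String indent) {
        StringBuilder out = new StringBuilder();
        if (node instanceof Primitive) {
            out.append(indent).append("text field: ").append(((Primitive) node).value).append('\n');
        } else if (node instanceof Composite) {
            out.append(indent).append("form\n");
            for (Map.Entry<String, Node> e : ((Composite) node).fields.entrySet()) {
                out.append(indent).append("  ").append(e.getKey()).append(":\n");
                out.append(describeEditor(e.getValue(), indent + "    "));
            }
        } else if (node instanceof Repeating) {
            out.append(indent).append("table\n");
            for (Node item : ((Repeating) node).items) {
                out.append(describeEditor(item, indent + "  "));
            }
        }
        return out.toString();
    }
}
```

Because the editor is picked per type, adding a new field to a study only changes the data handed to the model, not the editor code.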

More recently, I extended the tool into a repository application: splitting it into client and server components, and adding version control (history, differencing, and tags).

Toxbank User Interface (Leadscope)
A user interface for searching and visualizing new toxicity data

In 2004, the EU banned animal testing for cosmetics. The SEURAT-1 research initiative was then created to advance alternative methods for assessing safety. Leadscope joined the subproject, Toxbank, responsible for warehousing the newly generated toxicity data. The project has some interesting challenges in that the data being collected is from new techniques and study types, whose varied schemas are still in flux. To address this, flexible data representations were adopted (e.g. ISATab and RDF). My personal role has been the user interface; a Play! application that accesses several repositories through RESTful interfaces.

Simple Web Application Framework
A web framework in Java

For a few of the older web applications, I used a basic framework that I created. It's a handful of classes built on top of raw Java servlets that handle routing, serialization, session management, and template rendering with StringTemplate. These days, however, I would reach for Jersey under any of the same circumstances.

VS+Company Representation App (Tonic)
A site for managing photographer representation

This is an internal business application we made for a photography representation company. It handles common business tasks like scheduling and invoicing, as well as a workflow for negotiating usage rights. My role was entirely on the server side: web services, business logic, persistence, and some basic dev-ops work. The services were designed to work with SmartClient JavaScript components (the components expect a specific message format provided by Isomorphic's commercial server, which we did not adopt). I also integrated the system with the Google API for user authentication and contact management.

American Express: Restaurant Briefing (Tonic)
A periodical for the restaurant industry

This was a WordPress site for an American Express periodical. The project involved extending WordPress into a full content management system.

Cotton, Inc. Trends Forecast (Tonic)
Seasonal forecast trends in fashion

For several years now, we've been putting together seasonal fashion trend forecast sites for Cotton, Inc. The sites are completely redesigned on a regular basis. Initially, this was implemented in Nanoc (a static-site compiler in Ruby). I later ported it to React, allowing the designers to define the site with JSON configuration files.

Network Security Project (UC Davis)
Network monitoring and policy specification

At the security lab at Davis, one of our projects involved working with a prominent semiconductor company. Over the course of a few months, we provided network monitoring in one of their research areas. My contribution was to extend Network Radar (a security product developed by one of the lab's alumni). I created a DSL for specifying network access policies, which were then applied to observed traffic, signalling alarms and/or relaxing the security policy to fit acceptable observed patterns. At the end of the monitoring period we were able to provide an audit with recommendations for improving security. The policy specification and enforcement became the core component of my master's degree thesis.

Correlations
A hobby project correlating board game ratings

I've been playing board games forever, and have been following a board game ratings site for a while. At one point, the site released an API to access ratings and other information, and coincidentally I was looking at Pearson coefficients at work. It seemed like it might be useful for automatic recommendations. So, I wrote a scraper to collect users, games, and ratings, then calculated correlations between games and then between players, and made the results searchable through a simple web interface. The user-based recommendations didn't work out that well, but the correlations between games were effective.
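
For anyone curious about the game-to-game step, a Pearson coefficient over the users who rated both games can be computed roughly as follows. This is a minimal sketch in Java rather than the original code; the class name, the method name, and the choice of a user-to-rating map as input are all assumptions made for the example.

```java
import java.util.Map;

/** Illustrative only: the names here are hypothetical, not from the original project. */
final class GameCorrelation {

    /**
     * Pearson correlation between two games, using only users who rated both.
     * Each map goes from a user name to that user's rating of one game.
     * Returns NaN when fewer than two users overlap or when either game's
     * overlapping ratings have zero variance.
     */
    static double pearson(Map<String, Double> ratingsA, Map<String, Double> ratingsB) {
        int n = 0;
        double sumA = 0, sumB = 0, sumAA = 0, sumBB = 0, sumAB = 0;
        for (Map.Entry<String, Double> entry : ratingsA.entrySet()) {
            Double b = ratingsB.get(entry.getKey());
            if (b == null) {
                continue; // this user did not rate the second game
            }
            double a = entry.getValue();
            n++;
            sumA += a;
            sumB += b;
            sumAA += a * a;
            sumBB += b * b;
            sumAB += a * b;
        }
        if (n < 2) {
            return Double.NaN;
        }
        // Sums of squared deviations and cross deviations, in single-pass form.
        double cov = sumAB - sumA * sumB / n;
        double varA = sumAA - sumA * sumA / n;
        double varB = sumBB - sumB * sumB / n;
        if (varA <= 0 || varB <= 0) {
            return Double.NaN;
        }
        return cov / Math.sqrt(varA * varB);
    }
}
```

A value near 1 means people who rate one game highly tend to rate the other highly too, which is the signal a "similar games" list would sort on.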

Macintosh Games
A couple of arcade games for Macintosh System 7

Back in college I wrote a couple of games for the Macintosh (System 7) in C, using the Sprite Animation Toolkit. One was a Spacewar/Scorched Earth mashup. The other was a remake of Labyrinth, an old Apple II game. They were both public domain, and the latter got a nice write-up in Inside Mac Games.