Innovation Around Consistency

Today I saw this article lamenting the lack of innovation from roaster manufacturers. As anybody who has talked to me about the state of roast automation systems can attest, I've long complained that these are decades behind the state of the art in industrial process control. I would love to have the time and budget to develop something better, but the companies that can best afford to fund that work don't need it and the companies that would most benefit from what I would create can't afford to fund the work. That said, much of the article is about the need for consistency and I am not at all convinced that roaster manufacturers are the right companies to drive useful innovations there. I say this as a roaster who has developed software and observed measurable improvements in production consistency as I've added features to Typica.

Before we can really talk about consistency, it's important to know how to measure it. The thing that matters is consistency of the finished beverage, and this can be approximated with cupping techniques. However, setting out one or more cups from one batch and one or more cups from another and asking someone to characterize the difference will result in a lot of nitpicking that identifies both real differences and ones that do not exist. It is much better to set out two cups of one sample and either one or three cups of the other, with the odd cup(s) placed in a position that is random and unknown to the panel evaluating the samples. If panelists can't reliably pick out the right cup, then any differences they perceive might be due to errors in sample preparation, bad luck with a green coffee defect, or a lack of uniformity within a single batch, but they don't indicate inconsistency in roasting from one batch to the next. If panelists are able to reliably distinguish among the samples, that does indicate a consistency issue, and then it's important to ask how much variation is acceptable and explore options for reducing that variation.
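To make "reliably" a little more concrete, here is a minimal sketch of how often a panel would do this well by guessing alone. It assumes independent tastings and an exact binomial calculation, which is a common way to score this kind of test but not something prescribed above. The function name and the tasting counts (8 or 4 correct out of 12) are made up for illustration.

```python
from math import comb

def chance_probability(correct: int, trials: int, p_guess: float) -> float:
    """Probability of at least `correct` right answers in `trials` by guessing alone."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# Three cups, one of them odd: a blind guess is right 1 time in 3.
TRIANGLE_GUESS = 1 / 3
# Five cups (two of one sample, three of the other): only 1 of the
# 10 possible pairs of cups is the matching pair.
TWO_OF_FIVE_GUESS = 1 / comb(5, 2)

# Hypothetical panel results, for illustration only.
p_triangle = chance_probability(correct=8, trials=12, p_guess=TRIANGLE_GUESS)
p_two_of_five = chance_probability(correct=4, trials=12, p_guess=TWO_OF_FIVE_GUESS)

print(f"3 cups, 8 of 12 correct by luck: {p_triangle:.4f}")
print(f"5 cups, 4 of 12 correct by luck: {p_two_of_five:.4f}")
```

A small probability means luck is an unlikely explanation for the panel's success. Where to draw that line is a judgment call for each roaster.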

Unfortunately, most roasters do not have the capacity to do this sort of sensory evaluation for every batch so it's also important to have some proxy metrics. Even if you are cupping every batch, these proxy metrics can often help contextualize observed differences in flavor when these occur and provide valuable insight into possible causes and mitigation strategies.

Profile data can be used as one such metric. The article linked above points to this data as being inherently bad, calling out the range prior to turnaround as completely bogus. It's actually worse than that. The measurements don't truly approach reality until the rate of change peaks or levels out, which often happens 60 to 90 seconds beyond turnaround. That might mean that measurements in the first three minutes of the roast are unreliable. Fortunately, this happens in a portion of the roast where minor variations have little to no impact on the cup.
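As a rough sketch of how to find that point in logged data, the following locates turnaround (the lowest reading) and the later peak in rate of change. The function names, the 30 second window, and the sample readings are all assumptions made up for illustration, not values from a real roast or from any particular logging software.

```python
def rate_of_change(times, temps, window=30.0):
    """Degrees per minute over a trailing window of `window` seconds."""
    roc = []
    start = 0
    for i, t in enumerate(times):
        # Advance the window start until it is within `window` seconds of t.
        while t - times[start] > window:
            start += 1
        span = t - times[start]
        roc.append((temps[i] - temps[start]) / span * 60.0 if span > 0 else 0.0)
    return roc

def turnaround_and_roc_peak(times, temps, window=30.0):
    """Return (time of turnaround, time of peak rate of change after it)."""
    turn = min(range(len(temps)), key=lambda i: temps[i])
    roc = rate_of_change(times, temps, window)
    peak = max(range(turn, len(temps)), key=lambda i: roc[i])
    return times[turn], times[peak]

# Made-up readings every 30 seconds, starting at charge.
times = [0, 30, 60, 90, 120, 150, 180, 210]
temps = [400, 250, 180, 190, 215, 245, 272, 296]

turn_time, peak_time = turnaround_and_roc_peak(times, temps)
print(f"turnaround at {turn_time}s, rate of change peaks at {peak_time}s")
```

With this invented data the peak falls 90 seconds after turnaround, so readings before that point would be treated with suspicion.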

What's important here is not that measurements are accurate, but that they are consistent. It's critical that roasters verify the consistency of their measurements through careful observation of temperatures at the moments of observable physical changes in the coffee. I've used machines where these measurements are not at all consistent. It's often possible to mitigate that with very long pre-heats. Fortunately, there are roasting machines that do have good consistency here. I'm fortunate to roast on a couple of Diedrich roasters that are quite good in this regard. When characterizing consistency of your measurements, you'll often find that there's variation with batch size. For example, on my 1 kg lab roaster the measurements that I have for a 100g sample roast are not at all the same as the measurements on my 2 pound product development roasts. The measurements on 100g samples are consistent with other 100g batches and the measurements on 2 pound roasts are consistent with other 2 pound batches. On my production roaster, measurements are consistent across a broad range from about 6-30 pounds (though at the upper end of that range special care must be taken in profile design to ensure that roast plans are achievable). Awareness of the limitations of your measurements is important.

Once you know you have consistent measurements, time within key temperature ranges becomes a useful proxy metric for consistency. I like to have the time from the start of color change from green to yellow through the color change from yellow to brown (300-330), from the start of brown to the start of first crack (330-380), from the start of first crack through either the end of the roast or the start of second crack, depending on the roasting plan (380 to at most 430), and, for roasts that go beyond the start of second crack, from that point to the end of the roast. This can be altered if needed for more nuanced roasting plans, but is a good starting point. Note that these are raw duration values. I've spoken to many roasters who want to use development time ratio (that is, duration from the start of first crack divided by total duration) as a consistency metric, but that value is a terrible concept which makes absolutely no sense either theoretically or practically. Attempting to use it as a proxy for consistency will produce less consistent results on any sensible metric.
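A minimal sketch of capturing those range durations from logged data follows. The range boundaries are the ones given above (presumably degrees Fahrenheit). The range names, function names, linear interpolation between readings, and sample data are assumptions for illustration. This is not how Typica or any other program necessarily does it.

```python
# Boundaries from the text; the labels are hypothetical.
RANGES = {
    "yellowing": (300, 330),
    "browning": (330, 380),
    "first crack onward": (380, 430),
}

def first_rising_crossing(times, temps, target):
    """Time at which a rising series first reaches `target`, or None."""
    for i in range(1, len(temps)):
        low, high = temps[i - 1], temps[i]
        # Only rising crossings count, so the drop after charging is ignored.
        if low < target <= high:
            fraction = (target - low) / (high - low)
            return times[i - 1] + fraction * (times[i] - times[i - 1])
    return None

def range_durations(times, temps, ranges=RANGES):
    """Seconds spent in each range; a range never entered is omitted."""
    durations = {}
    for name, (low, high) in ranges.items():
        start = first_rising_crossing(times, temps, low)
        if start is None:
            continue
        end = first_rising_crossing(times, temps, high)
        if end is None:
            end = times[-1]  # the roast ended inside this range
        durations[name] = end - start
    return durations

# Made-up readings once per minute, starting at charge.
times = [0, 60, 120, 180, 240, 300, 360, 420, 480, 540, 600, 660]
temps = [400, 190, 220, 260, 290, 310, 330, 350, 370, 390, 405, 415]

for name, seconds in range_durations(times, temps).items():
    print(f"{name}: {seconds:.0f}s")
```

Comparing these raw durations from batch to batch, rather than any ratio derived from them, is the check described above.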

While range duration can be captured easily in software, there is certainly room for improvement in how this information is presented and especially in how this can be used during a roast rather than just existing as a post-roast check.

As an interesting aside, the existence of this detailed profile data is a slightly more recent innovation than the 50 years mentioned in the article. Roaster manufacturers didn't start advertising profile roasting systems until the 1980s, and Probat and other manufacturers didn't really take this mainstream until the 1990s. In the early 90s there were real discussions among manufacturers as to whether this profile roasting fad had staying power and if they really needed to include that thermocouple at all. More recently I've seen manufacturers experimenting with other options. Using an IR sensor instead of a thermocouple seems like a promising possibility, but reports from people who have used those say that the data is currently far too noisy to use. I don't believe that's an insurmountable challenge, but it is likely that roaster manufacturers do not have anybody who understands the statistical methods required to produce a useful measurement series from such a device.

Percent weight loss is another useful metric, and this can capture not only variations in how the coffee was roasted but also changes in the green coffee over time. Measurements from a degree of roast analyzer can also be helpful in verifying consistency, but it's important to note that there are many ways to roast a coffee to the same weight loss or color that will taste quite different in the cup. Any one proxy measurement will have limitations, but a combination of these can become reliable enough to be useful. Even with these, it's still a good idea to do controlled tasting and not rely entirely on any of these proxies.
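Percent weight loss is simple arithmetic: the weight lost in roasting divided by the green weight. A minimal sketch, with a hypothetical function name and made-up batch weights:

```python
def percent_weight_loss(green_weight: float, roasted_weight: float) -> float:
    """Weight lost during roasting as a percentage of the green weight."""
    if green_weight <= 0:
        raise ValueError("green weight must be positive")
    return (green_weight - roasted_weight) / green_weight * 100.0

# Made-up (green, roasted) weights for three batches of the same coffee.
# Any unit works as long as both weights use the same one.
batches = [(20.0, 17.1), (20.0, 17.0), (20.0, 16.9)]
losses = [percent_weight_loss(green, roasted) for green, roasted in batches]
spread = max(losses) - min(losses)

print("losses:", ", ".join(f"{loss:.1f}%" for loss in losses))
print(f"spread between batches: {spread:.1f} points")
```

How much spread is acceptable is something to settle through tasting, not something this number can decide on its own.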

Once you have a good set of procedures in place for measuring production consistency it's possible to more meaningfully evaluate how big a problem that really is, identify possible sources of production variation, consider changes intended to reduce that variation, and determine if those changes have succeeded in the goal of reducing variation. Sometimes it is a simple matter of training while at other times technology can be useful. While developing new features in Typica, this is something that I look at very closely as I don't want to introduce features that result in increased variation (feature prototypes have sometimes had this result and I've either scrapped or redesigned those features before making those changes available to other roasters). Several features now exist which have produced measurable consistency improvements.

That such large improvements are possible through improved software suggests that roaster manufacturers should not be driving this innovation. Looking at data systems developed by or as a product for roaster manufacturers, these have tended to stagnate quickly after an initial release, often have bizarre and inexplicable design limitations, and have little to no consideration of integration with systems outside of their product scope. Roaster manufacturers are bad at software. A welcome improvement would be exposing measurement and control interfaces and documenting these so that anybody can develop software to interface with the machine and make the improvements that they most need. Unfortunately, manufacturers that add connectivity options often reinvent the wheel, keep communication details proprietary and undocumented, and attempt to force lock-in to their utterly anemic yet overpriced software offering. This actively discourages useful innovation and makes those connectivity options far less attractive.

This isn't to say that roaster manufacturers should not move beyond their current designs. Hardware innovations such as pressure profiling, additional sensors, precisely controllable motors, new approaches in heating, and the like may be useful and provide additional control over the expression of a coffee in the hands of a skilled roaster, but these also introduce variables that are important to understand and characterize when the goal is consistency. There's definitely a learning curve that roasters choosing to take advantage of these features must commit to.

Quick Tips for Improving Consistency

Once you're measuring production consistency, you can start improving it. Here are a few easy things to try that should produce measurable improvements.

Measure temperatures in Fahrenheit. This mainly applies to roasters operating outside of the United States. A Fahrenheit degree is smaller than a Celsius degree (5/9 the size), so working with the same number of digits gives you a significant improvement in precision. If you're planning control changes or end points at certain temperatures, you can now operate at improved precision.
Pay attention to your cooling profile. At the end of the roast, coffee is still very hot and chemical changes are still active that will impact flavor in the cup. Cooling variations explain a lot of roasting differences.
Use Typica. Improving production consistency has been a goal of many existing features and I'm aware of many promising areas for future improvement. That development happens at a roasting company means that there's more resistance to adding features that may sound good but actively work against the goals of the roaster.
Enable rate of change calculations. Subtle profile variations are much easier to see in the rate of change series than in the raw temperature data. This makes it easier to read ahead and allows faster and more precise corrections.
Enable profile translation at the green to yellow transition. Variations in the earliest portion of the roast when the coffee is still green have little to no impact on the cup provided that the roast perfectly matches the adjusted plan from the translation point to the end of the roast. This eliminates the temptation to compensate for these early differences that don't impact the cup with changes in a portion of the roast where that change will more significantly alter expression in the cup. This feature isn't for everybody as some have found that they lose control at the end of the roast when not matching things early, but I believe this is generally a problem with operation technique rather than a limit of any particular machine. On the other hand, at companies that have had success with this, the effect is more beneficial for less experienced roasters, though there is some improvement for everybody.
Continue to measure production consistency, both with your chosen proxies and through blind discrimination testing. When inconsistencies are found, try to identify the source of that inconsistency and possible strategies for dealing with that.
Support the development of Typica financially. While Typica already has a feature set that would benefit most roasters, there's also a lot of room for further improvement. I currently spend more on developing and making Typica available than comes in and that definitely limits what it is possible for me to work on. Your contribution may not seem like much to you, but if everybody using Typica supported its ongoing development that would add up to a useful sum.
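For readers unfamiliar with the profile translation tip above, one way to picture it is as a shift of the roast plan along the time axis so that the plan and the actual roast pass through the translation temperature at the same moment. The sketch below is only that reading, not Typica's code. The function names, the 300 degree mark (taken from the green to yellow range mentioned earlier), and the sample plan are assumptions for illustration.

```python
def first_rising_crossing(times, temps, target):
    """Time at which a rising series first reaches `target`, or None."""
    for i in range(1, len(temps)):
        low, high = temps[i - 1], temps[i]
        if low < target <= high:
            fraction = (target - low) / (high - low)
            return times[i - 1] + fraction * (times[i] - times[i - 1])
    return None

def translate_plan(plan_times, plan_temps, actual_time_at_mark, mark_temp=300.0):
    """Shift the plan in time so it reaches `mark_temp` when the roast did.

    Returns the shifted plan times and the offset in seconds.
    """
    plan_time_at_mark = first_rising_crossing(plan_times, plan_temps, mark_temp)
    if plan_time_at_mark is None:
        raise ValueError("the plan never reaches the translation temperature")
    offset = actual_time_at_mark - plan_time_at_mark
    return [t + offset for t in plan_times], offset

# Made-up plan, one reading per minute. It reaches 300 at 270 seconds.
plan_times = [0, 60, 120, 180, 240, 300, 360]
plan_temps = [400, 190, 220, 260, 290, 310, 330]

# Suppose today's roast reached 300 at 285 seconds, 15 seconds late.
shifted_times, offset = translate_plan(plan_times, plan_temps, actual_time_at_mark=285.0)
print(f"plan shifted by {offset:+.0f}s")
```

After the shift, the operator follows the adjusted plan from the translation point onward without trying to make up the 15 seconds lost while the coffee was still green.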