Automating newspaper distribution with Airtable?

PhilipK · August 17, 2019, 9:04am

I manage distribution for a monthly community newspaper, run by volunteers, and I’m looking into how we can use automation in the distribution process, and specifically whether Airtable is the right tool to be looking at, having heard about it recently in the Automators podcast.

What I’m hoping for in posting here is just a brief nod to say whether Airtable can handle it. I’ve contacted Airtable sales already but they can’t schedule a call until next week, and I’m keen to start work on it this weekend. If the word here is that it is capable, I’ll probably just dive straight in and not wait for the call with Airtable.

We distribute 12,000 copies to 300 local outlets who each stock from a handful to several hundred copies. These stockists change from time to time, and the quantity stocked at each outlet may vary from month to month.

We currently use a spreadsheet to generate printouts of the stockist list for each delivery round, but it’s grown unwieldy so I’m looking for a more powerful and flexible method, one which is also amenable to automating events such as posting to social media when a new edition is marked as being available from a specific outlet. But the key question I haven’t been able to find an answer for is whether Airtable is flexible enough to solve the following:

Broadly, we have Editions, Quantities, and Stockists, so for each Edition we deliver a specific Quantity to a specific Stockist, and we generate lists for distributors to follow:

We need to know, per Stockist:

the quantity expected to be delivered this month
the quantity delivered last month

Then during distribution we need to be able to update per Stockist:

the actual quantity delivered
the quantity remaining from last month
any change in quantity for next month

For each Edition we need to track the overall balance of copies remaining to be distributed, and for each Stockist we want to graph the trend of quantities over time.

That’s the core requirement, but we would also want to use it to drive a list of outlets on our web site, allow new outlets to register via a web form, produce metrics to support advertising and sponsorship applications, etc. etc.

I’m quite openminded about whether we abandon printed lists entirely in favour of working from a phone or tablet, or whether we use a combination of printouts and devices with the detail being updated in comfort back at base.

I’m pretty solidly iOS/macOS.

Thank you if you’ve read this far, and thanks in advance for any suggestions you may be able to give.

sylumer · August 17, 2019, 9:57am

What’s the “unwieldy” aspect that you are hoping to address? That’s the point that piques my interest; my day job is in a niche field of IT consultancy (not app or site development), so it’s a kind of professional trigger to dig further when I hear phrases like that.

Everything else seems to be solvable through a database and web APIs/services which AirTable in combination with other services should be able to provide. The web site integration will depend to some extent on your web host platform, but in theory it is certainly possible to pull data in and post data from.

PhilipK · August 17, 2019, 12:51pm

Thanks Stephen,

The current setup is unwieldy because it started as a simple list of stockists but has grown more complex as more outlets are added every month and the routes get more complicated, but it was never set up as a database, there’s nothing clever about it. Extracting anything useful for long-term analysis is daunting.

The current spreadsheet has columns for stockist name, status (active/inactive), location, usual delivery quantity, previous month quantity, and then columns for quantity delivered (one column for each delivery day), balance outstanding, and notes.

For each new edition the previous month’s spreadsheet gets duplicated and the quantities are deleted, then the relevant quantity data is copy-pasted from the old sheet to the new one. The sheet gets printed, and then updated in pen while deliveries are being made, and the spreadsheet gets updated back at base.

So we now have perhaps 10k rows of dumb data spread over dozens of sheets, with no relationships between them, which is obviously a pretty lame way to do things and so that’s why I’m looking at setting up a better method. It could be that improving the spreadsheet is the way to go, but I’d rather use the time to build something more versatile, with more possibility of expanding it with other bells and whistles.

sylumer · August 17, 2019, 1:09pm

Improving the spreadsheet could generally improve things, but you’re absolutely right to scale up to something more sophisticated (i.e. a database) for structuring your data given what you want to do.

In reference to something above on displaying data on web APIs pages, it looks like basic support for embedding is available via iFrames. Not ideal, but no doubt a big step forward from where you are now.

PhilipK · August 17, 2019, 1:38pm

For the web page it may be that exporting a view to CSV and then parsing the relevant data to html would be another way to go. All we’d need is a list of the top twenty stockists and it wouldn’t need to be live data, just updated once a month.

Or it might be possible to get the data into a map view somehow.

But back to the main question, I think I need to open a test account at Airtable and have a play to see if the relational aspect is doable within their system. I’m not a database guy so it’s not obvious to me from reading their docs and examples whether it’s the right tool.

RosemaryOrchard · August 17, 2019, 2:35pm

This sounds doable with AirTable - though depending on your implementation you may end up with some very wide tables. Their forums are good and there are lots of consultants there who you can pay to set something up for you too if you run into trouble.

PhilipK · August 17, 2019, 2:38pm

Thanks Rosemary. Can I ask you to expand a little on why you think it might lead to very wide tables?

RosemaryOrchard · August 17, 2019, 2:51pm

You may end up with all the suppliers in the editions table, which is not ideal - though it allows for some ease of usage. Personally I’d make a “supplier editions” table which refers to suppliers and editions and lets you log how many copies they want. Then you can pull the sum into the edition table (and the sum of left over editions) too.

sylumer · August 17, 2019, 3:42pm

If you have never studied database design, you may wish to familiarise yourself with something called “normalisation”. It’s about building efficient ways to structure and access your data. It will then hopefully make more sense about why you might want wide tables, multiple tables, etc.

It will also make your database easier to understand, maintain, and extend in the future.

Whilst I spent days studying and applying the theory at university and then many further days extending that study in working life, the key points probably take less than an hour to get your head around if you have a technical slant as I’m guessing you might from this and previous posts. To that end, this page looks like it might give you a good start.

https://www.studytonight.com/dbms/database-normalization.php

As someone who spent over a decade developing both mainframe and client-server systems utilising relational databases I’d also recommend designing your database off-system at first; no matter how simple the data structure might seem at initially! Also adopting a clear and consistent naming convention for your tables and columns is something to think about. Both will pay dividends in the long run.

On design, you might want to research something called “Entity Relationship Diagrams”. There’s a couple of videos I found that look reasonable, but they are pitching a particular piece of diagramming software. Set that to one side and just focus on the database design content.

Part 1

Part 2

On sample naming considerations, things like this are the sorts of things that immediately spring to mind:

Name tables and columns in the singular rather than the plural - a table called “supplier” vs. a table called “suppliers”.
Use camelCase, PascalCase or snake_case.
Check for maximum name lengths and ensure you never have to truncate names.
Write down the rules you use for naming so you can be consistent with future developments.

Hope that helps.

PhilipK · August 17, 2019, 4:07pm

Thanks Stephen, that’s sorted out my homework exercises for tomorrow! It all looks hugely helpful and I’m very grateful for your time.

PhilipK · August 18, 2019, 11:55am

This has been tremendously helpful. When I started sketching out the tables a few days ago it turns out I was doing normalisation without knowing the name, so it’s good to know I was on the right track there, similarly the entity relationship stuff is not entirely unfamiliar. But those tutorials have helped clarify and focus my thinking, and led me to generalise the problem to one of fulfilling product orders: the Edition is the product, the Stockist is the customer, and the Quantity to be delivered is the order, which (with a few other tables for location, etc.) will let me derive the distribution lists I need. So I’ve got plenty of work to do there, but that’s not really the realm of Automators so I’ll close the discussion of the database aspect here, with my thanks for your help, but I’m sure I’ll be back when it comes to adding the automation. In any case I’ll report back.

PhilipK · August 21, 2019, 3:53pm

Looking into how best to organise my tables, I found this blog post about the primary column in Airtable and why it’s not the same as a primary key. Although for my purposes I think it’s probably still useful to regard it as such.

sylumer · August 21, 2019, 4:35pm

They are saying you don’t get to see the real primary key, and that it doesn’t conform to the rules associated with a primary key.

But reading between the lines it says that if you want to link tables you need to use it. That will make it in effect a foreign key (read about those yet?) for the relationship… but foreign keys utilise primary keys?!!

I assume that because of that one of two things occurs. Either you get a risk of non-unique matching (Urgh), or Airtable does stuff behind the scenes subbing in the real primary key (which might actually be worse as what you see may not be what you get).

That then begs the question of what now?

Well, in my opinion, your instinct to treat the primary column like a primary key is a solid one. Even without any referential integrity or uniqueness constraints, doing so will give you a meaningful and reliable set of data structures to work with. By designing the data structure with that relational database mindset it actually becomes somewhat independent of the constraints and nuances of the platform it is hosted on. The platform really just needs to be capable of manipulating the tables of data.