
Do Not Disturb the Employee

Vladimir Dietrich · July 7, 2025 · 4 min read

I received a request for statistics in the workplace.

An Excel spreadsheet with relevant data.

How do we populate this Excel sheet?

The person who asked me for more statistics is an excellent data annotator: they have their own spreadsheet with beautiful annotations about everything they do.

In their spreadsheet, percentages track several kinds of efficiency, clearly color-coded on the screen.

How can we convince other employees to also be excellent data annotators for statistical purposes?

In our conversation, I took a skeptical view of data entry: even if we create beautiful forms, who can guarantee that all employees will always fill in the numbers needed to compile statistics for the entire sector?

My skeptical answer: no one can. On the contrary, it is almost certain that someday someone will skip some data, and over a long period it is likely that many employees will skip many data points. After a few years, the chance that this extra effort has lapsed, fully or partially, is close to 100%.
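The "close to 100%" intuition can be checked with a little arithmetic. This is a minimal sketch, assuming each expected entry is skipped independently with the same small probability; the 1% rate and the entry counts are illustrative numbers, not measurements from the sector.

```python
def chance_of_gap(miss_rate: float, entries: int) -> float:
    """Probability that at least one of `entries` expected entries is missing."""
    return 1 - (1 - miss_rate) ** entries


# Hypothetical figures: 10 employees, one entry per working day
# (about 250 per year), and 1% of entries skipped.
for years in (1, 3, 5):
    entries = 10 * 250 * years
    print(f"{years} year(s): {chance_of_gap(0.01, entries):.6f}")
```

Even with a low miss rate, the number of expected entries grows so quickly that a complete spreadsheet becomes very unlikely.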

We were at this impasse: the statistics are very much needed, and they must cover the entire sector. When we publish statistics for the entire sector, we cannot be left wondering whether someone skipped an entry.

A parallel demand was the traceability of documents: we needed to quickly search for documents, and we could also use the spreadsheets that generate statistics to search for document numbers.

Here the problem of uncertain adherence becomes even greater: how can we report that a document was not found when we are not sure that every employee always fills in the spreadsheet?

We let our brains heat up a bit, synapses working.

A solution appeared that I consider not only elegant and functional but also delicate and respectful: it bothers no one and is truly automated.

This sector only has one certainty in its entire production line: at the end, a report will be produced.

None of the rest is certain: how the report is made, whether parallel spreadsheets or forms are filled out, how long it takes, or which resources are used. There is freedom in how the work is produced, because this is an area that requires intelligence and analysis, and each employee has their own method for reaching the final result.

The final result, however, is certain: a report. In PDF.

It may have been produced in Word, in Google Docs, or by any other means. But at the end of everything it will be a beautiful report in PDF.

So here is the plan: acquire a machine for no more than a thousand dollars, with ample RAM, a good NVIDIA video card, and a good CPU, to run artificial intelligence locally, just to keep the reports off the web. Yes, it would be easy to query an artificial intelligence through an API instead, and that would work on any simple machine. But we opted for local artificial intelligence.

A Python script monitors the network folder that receives all the ready-made PDF reports from the sector.
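The monitoring step can be sketched as a simple polling loop. This is only an illustration, not the script running in the sector: the function names, the one-minute interval, and the folder path in the usage comment are assumptions, and it uses only the standard library.

```python
import time
from pathlib import Path
from typing import Callable


def find_new_reports(folder: Path, seen: set[str]) -> list[Path]:
    """Return the PDFs in `folder` whose names are not in `seen`, and record them."""
    new_reports = []
    for path in sorted(folder.glob("*.pdf")):
        if path.name not in seen:
            seen.add(path.name)
            new_reports.append(path)
    return new_reports


def watch(folder: Path, handle: Callable[[Path], None],
          interval_seconds: float = 60.0) -> None:
    """Poll `folder` forever and call `handle` once for each new report."""
    seen: set[str] = set()
    while True:
        for report in find_new_reports(folder, seen):
            handle(report)
        time.sleep(interval_seconds)


# Usage (hypothetical path; runs until interrupted):
# watch(Path("//server/sector/reports"), handle=print)
```

Because `seen` lives only in memory, a restart would process every report again. A real deployment would likely persist the list of processed files, and might wait until a file has finished copying before reading it.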

When a new report appears, the artificial intelligence extracts the fields we need to compile statistics.
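One way to picture the extraction step: the fields are a list of plain-language questions, and the model is asked to answer them all as JSON. In this sketch the field names and questions are hypothetical, `ask_model` is a stand-in for whatever call reaches the local model, and reading the text out of the PDF is not shown.

```python
import json
from typing import Callable

# Hypothetical fields: each key is a spreadsheet column,
# each value is the question put to the model.
FIELDS = {
    "document_number": "What is the document number of this report?",
    "report_date": "What is the date of this report, as YYYY-MM-DD?",
}


def build_prompt(report_text: str, fields: dict[str, str]) -> str:
    """Combine the questions and the report text into a single prompt."""
    questions = "\n".join(f'- "{name}": {question}' for name, question in fields.items())
    return (
        "Answer the questions below about the report. "
        "Reply with one JSON object using exactly these keys, "
        "and use null when the report does not say.\n"
        f"{questions}\n\nREPORT:\n{report_text}"
    )


def extract_fields(report_text: str, ask_model: Callable[[str], str],
                   fields: dict[str, str] = FIELDS) -> dict:
    """Ask the model and return one value per field (None when missing or unparseable)."""
    reply = ask_model(build_prompt(report_text, fields))
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {name: data.get(name) for name in fields}
```

With this shape, asking for a new statistic means adding one entry to `FIELDS`; no parsing code has to change.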

Since the way reports are written varies a lot, I chose not to use regex, or even to try.

I confess that it would be possible to adapt a good regex to the many ways of writing reports, but what I really wanted was another advantage, scalability, in two respects:

  • Not having to adapt the code, or having to adapt it less, each time a new employee arrives. This allows the code to last for years or decades without major adjustments.
  • I want the team to be able to add new statistics without depending on a regex super specialist, usually just by adding a new question to the query sent to the artificial intelligence, which then writes the new information to a new column in the spreadsheet. Freedom to innovate quickly! ☀️

The solution turned out to be more powerful than we imagined when we first faced the problem. With it, a one-to-many relationship was implemented for the first time in the sector. Because the artificial intelligence returns an array (treated as a set, to remove duplicates) when a question has many answers of the same type, we can now compile statistics on, and quickly find, items that occur an indefinite number of times in each report.

In this case, we created a new tab in the spreadsheet and added all the items from the array returned by the artificial intelligence next to the unique value that determines which row of the main table these items belong to.

One of these items is highly sought after by those who request information from the sector, so this innovation will streamline the process of the entire sector and those who consult the sector.
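The two-tab layout described above can be sketched with plain Python lists standing in for the spreadsheet tabs. The column names and the `report_key` identifier are hypothetical, and writing to an actual Excel file (which needs a third-party library) is left out.

```python
def add_report(main_header: list[str], main_rows: list[list],
               item_rows: list[list], report_key: str,
               fields: dict, items: list[str]) -> None:
    """Add one report to the main table and its repeated items to the items table."""
    # A field never seen before becomes a new column in the main table.
    for name in fields:
        if name not in main_header:
            main_header.append(name)
    # Older rows are padded so that every row matches the header.
    for row in main_rows:
        row.extend([None] * (len(main_header) - len(row)))
    record = {**fields, "report_key": report_key}
    main_rows.append([record.get(name) for name in main_header])
    # One row per distinct item, each pointing back to its report (one-to-many).
    for item in dict.fromkeys(items):  # removes duplicates, keeps order
        item_rows.append([report_key, item])


# Usage with made-up values:
header = ["report_key", "report_date"]
main_tab: list[list] = []
items_tab: list[list] = []
add_report(header, main_tab, items_tab, "R-1",
           {"report_date": "2025-07-01"}, ["item A", "item B", "item A"])
```

Searching the items tab for a value then leads straight to the key of the report it belongs to, which is the row to look up in the main table.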

Of course, we could use an SQL database or another kind of database; we could use Google Sheets; there are various options.

I preferred to keep what employees are used to seeing in their daily lives.

This type of solution makes me so happy, and do you know why?

Because employees won't even notice the solution at work.

Their lives will not change at all.

If one day they want to change, I know there is a human resources sector, or they can get together and change.

But it is not I, an admirer of technologies, who will ask an employee or an entire sector to change.

If I can be as discreet as a ladybug that lands lightly, unseen, yet brings good luck (or statistics and traceability), then I want to be the ladybug that no one notices!

Great day to all! ☀️☀️☀️☀️