Logging

The topic of logging is often skipped or omitted, because logs are not a critical part of the system. Yet when it comes to debugging and figuring out what happened and why the application failed, logs turn out to be very useful. Good logs are a nice-to-have feature whose priority lies in the important-but-not-urgent quadrant, the one most worth investing in.

Good logs are a worthy investment.

When working on a distributed system, or on any application that runs somewhere other than your own machine, it makes sense to have a log management tool. It's bad practice for a developer to read logs directly from the application output in a remote runtime environment. The main reason is accessibility: if you want someone to be able to read your logs, you have to grant them access to the production runtime environment. The second argument is the ephemeral nature of runtime environments: you may lose all your historical logs as soon as the runtime environment is restarted or re-created on another server. The third is scalability: when an application is scaled horizontally, finding the logs you need becomes a difficult and challenging task.

The only time you don't need a log management tool is when the runtime environment is your own machine.

Usually the logs your application writes go to standard output, where they are read by the log management tool's agent and pushed to remote storage. Most log tools require you to pass logs as key-value records serialised into some common format like JSON, separated by new lines. The most common keys of those records are level and message.
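As a minimal sketch of what that looks like in practice (the `formatEntry` and `log` helpers here are illustrative, not a real library):

```javascript
// Minimal structured-logging sketch: serialise each entry as a
// single JSON line on stdout, where a log agent can pick it up.
function formatEntry(level, message, extra = {}) {
  return JSON.stringify({ level, message, ...extra });
}

function log(level, message, extra) {
  process.stdout.write(formatEntry(level, message, extra) + '\n');
}

log('info', 'User created', { userId: 123 });
// emits: {"level":"info","message":"User created","userId":123}
```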

The level indicates the amount of attention a log entry requires and is mainly used to separate errors from non-errors. The most common levels developers use nowadays are:

  • error - the system failed, let the developer know why
  • warning - something happened that shouldn't have, but we're aware of the possibility; let the developer know
  • info - something ordinary happened; this log entry just confirms it actually happened
  • debug - junk logs used during development; please don't push them to the remote repo
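The levels above are usually ordered by priority, so a logger can be configured with a minimum level and silently drop everything below it. A sketch of that idea, with a hypothetical `makeLogger` factory that keeps entries in memory instead of writing to stdout:

```javascript
// Hypothetical level-filtering sketch: each level has a numeric
// priority, and entries below the configured threshold are dropped.
const LEVELS = { error: 3, warning: 2, info: 1, debug: 0 };

function makeLogger(minLevel) {
  const emitted = [];
  return {
    emitted, // kept in memory here; a real logger would write to stdout
    log(level, message) {
      if (LEVELS[level] >= LEVELS[minLevel]) {
        emitted.push({ level, message });
      }
    }
  };
}

const prodLogger = makeLogger('info');
prodLogger.log('debug', 'noise');         // dropped
prodLogger.log('error', 'system failed'); // kept
```

This is why debug entries are harmless locally but should not reach production: a production threshold of `info` or above filters them out entirely.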

As the unwritten standard says, the log format requires level and message values, but logs are not limited to them. A common mistake is putting extra information into the message value. Since a log entry object is not limited to the message, we should extend the entry with key-value pairs carrying that extra information.

Here's an example:

// Bad
{
  "level": "info",
  "message": "[UserModule]: User with id=123 created"
}

// Good
{
  "level": "info",
  "message": "User created",
  "module": "User",
  "userId": 123
}

When we put extra info into distinct log entry fields we win two things. First, we make them indexable: our centralised log management tool can index all logs containing userId=123, not only in the user module but everywhere in the application. That means we can trace a single flow or thread of data through the whole mess of logs. The second benefit is more subtle: since the message does not change from event to event, we can use it to build a custom metric. Just by counting the log entries with message="User created" we can get the number of created users.
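As a toy in-memory illustration of that metric (real log tools do this counting for you over their index):

```javascript
// Toy example: count structured log entries by exact message,
// the way a log management tool would when building a metric.
function countByMessage(entries, message) {
  return entries.filter((e) => e.message === message).length;
}

const entries = [
  { level: 'info', message: 'User created', module: 'User', userId: 123 },
  { level: 'info', message: 'User created', module: 'User', userId: 124 },
  { level: 'info', message: 'User deleted', module: 'User', userId: 99 }
];

countByMessage(entries, 'User created'); // → 2
```

Note this only works because the message is a stable constant; had it been "User with id=123 created", every entry would be unique and uncountable.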

Do not include details in a log message; instead, append them as separate key-value pairs in the log entry.

You may also want to include a list of predefined keys at the level of the application, module, route or flow, so you don't have to add them manually each time. This practice is called a logging context.

const moduleLogger = new Logger({ moduleName: "User" });

function createUser() {
   ...
   moduleLogger.info("User created", { userId });
}

Another common practice in logging web-based and distributed applications is to attach the same requestId to all log entries related to the same request. Usually the requestId is generated and attached at the level of the client-facing handler. Here's an example with express.js:

app.post('/createUser', async (req, res) => {

    // generate request id
    const requestId = randomString();

    // log request id
    logger.info("createUser", { requestId });

    // pass request id to the handler
    const resp = await createUserHandler(req.body, requestId);

    res.send(resp);
});

The requestId can also be passed to downstream services as a header; the de facto (unofficial) header name is x-request-id.

const createUserHandler = async (data, requestId) => {
   ...
   // user is created here
   ...
   // call email service to send welcome mail
   await fetch(EMAIL_SERVICE_URL, {
       method: 'POST',
       headers: {
           'content-type': 'application/json',
           'x-request-id': requestId
       },
       body: JSON.stringify({
           template: 'welcome_mail',
           userId: user.id
       })
   });
}

And the mail service route may look like this:

app.post('/sendEmail', (req, res) => {
    const requestId = req.headers['x-request-id']
        || randomString();

    ...
});

Attach a request id to the logs of web-based applications.

The main caveat of the code above is passing the request id as an argument. Real code is usually heavily nested, and adding an extra argument to every function in the call stack is tedious and doesn't feel right. There is a solution, at least in the Node.js world: AsyncLocalStorage, a kind of store scoped to a function's execution context. The code above using AsyncLocalStorage would look like this:

import { AsyncLocalStorage } from 'node:async_hooks';

const asyncLocalStorage = new AsyncLocalStorage();

app.post('/createUser', (req, res) => {
    const reqContext = {
        requestId: req.headers['x-request-id']
            || randomString()
    };

    asyncLocalStorage.run(reqContext, async () => {
        const parentContext = asyncLocalStorage.getStore();

        // log request id
        logger.info("createUser", {
            requestId: parentContext.requestId
        });

        // no requestId in arguments
        const resp = await createUserHandler(req.body);

        res.send(resp);
    });
});


const createUserHandler = async (data) => {
   ...
   // user is created here
   ...

   const parentContext = asyncLocalStorage.getStore() || {};

   await fetch(EMAIL_SERVICE_URL, {
       method: 'POST',
       headers: {
           'content-type': 'application/json',
           'x-request-id': parentContext.requestId
       },
       body: JSON.stringify({
           template: 'welcome_mail',
           userId: user.id
       })
   });
}

As you can see, we can get the parentContext from any nested function, with no need to rewrite the whole codebase adding a requestId argument to every function.

Use AsyncLocalStorage to pass context to nested functions.

Finally, a note about log security. Logging systems are considered one of the possible attack vectors of an application, so it's possible a third party could steal your logs. Log carefully: do not log any sensitive data, especially auth tokens, recovery URLs and keys. It's also bad practice to log any personally identifiable information (PII) about users. Be careful with what you log.

Do not log any sensitive data.