Tags and LSPs where do I start?

Introduction

With the introduction of the Language Server Protocol (referenced as LSP from here on in), there’s increasing confusion over what they offer and if people still need tags.

It gets even more confusing for a beginner who is getting their feet wet with text editors and is suddenly given the recommendation to download universal-ctags, gutentags, coc and the LSP client of their choice. I’m not surprised people are confused so let’s go over what’s what and clear some things up…

What are tags?

A tag is a reference to a place of interest in your code. This could be a function definition, a variable declaration, a class or something else. A tag is stored in a ‘tags’ file containing all of the tags for your projects that were found. Here’s an excerpt of a tags file from one of my Typescript projects

recFindByExt	dist/src/file.js	/^function recFindByExt(base, ext, files, result) {$/;"	f
release	conf.py	/^release = u'0.0.11'$/;"	v
select	dist/src/lexer/statementFactory.js	/^        const statementMap = {$/;"	p	class:const
simple	dist/src/formatter/formatterFactory.js	/^        const formatMap = {$/;"	p	class:const
source	dist/src/formatter/formats/json.js	/^        const message = {$/;"	p	class:class

They can help you reason about your code and answer questions you may have such as:

Where is this function defined?

Where is this function used?

Where is this variable declared?

How do I create tags?

Tags are usually stored in a file and can be created by a program that generates tags. The most popular and widely used is universal-ctags better known as ‘ctags’.

What’s ctags?

ctags is a program to generate a tags file, that’s it! I’d recommend using universal-ctags, not exuberant-ctags since it’s actively maintained.

If you want an easier installation process, you can use exuberant-ctags just by doing sudo apt install ctags. As far as I know, you have to compile universal-ctags yourself.

Great, what are things like Global?

Global (or GNU Global) is another source code tagging system similar to ctags. There are a few differences but the biggest one that got me using it in the first place was that it could find references in code. As far as I’m aware, ctags is not capable of this.

There are many different source code indexing systems, as a catch-all, universal-ctags is a safe bet. Research others if you feel you’re missing something.

Do I have to pick one or the other?

Nope. Back in the day before the LSP came around I had a bastardised config that consisted of both GNU Global and universal-ctags. I used GNU Global everywhere that I could (I found it more reliable) and fell back to universal-ctags when necessary.

(There’s a vim plugin bundled with GNU Global that does all this for you)

It’s worth mentioning that vim supports universal/exuberant ctags out of the box compared to GNU Global. Read :help tags to learn more.

And language servers/the LSP?

A language server is an implementation that follows the language server protocol. These have an advantage over tags in that they are verifiably correct, as they actually parse the code you’re working with instead of relying on regular expressions and string matching.

Generally speaking, they are more reliable than a tags system.

What does a language server do?

A language server is capable of correctly answering questions such as:

Where is this method used?

Format this text selection for me

Rename this method across the codebase

Find where this variable is declared

A language server and the LSP are not just about code navigation. They offer code refactoring, formatting, documentation and more.

Okay, I still need you to explain the LSP…

With the LSP, every language you use in your code will have a language server (and client). You can see a reputable list here

If you want to use a language server in vim (or any other editor), then you will need to find a way of supporting the LSP and the language server you’re using.

As a real world example:

I work with PHP everyday and I use the LSP. I use coc (who doesn’t) as my method of talking to language servers and then I use coc-phpls as my language server for PHP.

That’s it! There is no configuration on my side needed for this apart from a few mappings:

nmap ]r <Plug>(coc-references)
nmap ]d <Plug>(coc-definition)
vmap ]f <Plug>(coc-format-selected)

Once that is done, I get intelligent completion all the way down. It does not choke or get lost on abstract classes or interfaces and has not once struggled to resolve a class no matter how deep the class hierarchy.

If you want to add another language server, you just find the implementation and install it. For example coc-tsserver for Typescript.

But I’m not using Coc…

That’s fine. There are lightweight solutions out there or you can literally query the language server yourself (it’s a lot of work and I wouldn’t recommend it unless you want to learn about the LSP);

For Neovim users, there is now a built-in LSP implementation and also configurations for popular language servers. These can be found here (I will be moving to this shortly, I don’t like selling my soul to coc).

Okay, I think I get it! Should I choose tags or a language server?

As always and irritatingly, the answer is “it depends”. However, in my own opinion, I will say to use language servers and then smatter some tags on top if you need it.

FAQ

Is there any reason to use tags if I’m using a language server?

I’d say no, however I’ve come across some edge-cases where a tags system actually works better than a language server.

Take the following example:

$mockCustomer = $this->createMock(Customer::class);
$mockCustomer
    ->method('loadById')
    ->willReturn(new Customer());

For those that don’t know much about testing or PHP, this is creating a mock object which will override the default loadById behaviour of the Customer object. Now the thing is, if I change the name of my loadById method in the Customer class itself and use a language server to find all usages and change those too, it will not pick this string 'loadById' up. Why? Because it’s just a string, not an actual call to the method. However, when we do change the name in the class, we need to update it here as well as it’s coupled to the name of the method.

In these circumstances, the naivety and simplicity of a tags program comes in handy.

Question taken from here

There must be other pros to tags?

Yes you’re right. Tags have built-in support in vim and are significantly easier to setup than a language server. See :help tags to learn more about the support offered.

Tags also offer multiple languages out of the box with much less overhead. With ctags, you have one program that handles potentially hundreds of languages. With the LSP, you need to download a language server per language. The average developer probably works with 3/4 languages daily so they’ll be running that many different servers at once on their machines. This is something that can very easily consume your CPU if you’re not careful.

Lastly (from what I can think of), there are built-in mappings to work with tags which makes working in large codebases a breeze. See :help tagsrch and :help CTRL-].

What’s the difference between Gtags and Ctags?

As mentioned previously, there’s not a huge difference except GNU Global (sometimes referred to as Gtags) is able to find references of method usage.

Can’t I just use grep?

You can, and there’s nothing wrong with that for a shotgun approach to code refactoring but if you want more useful/intelligent results to come through, use a tags system or a language server.

Useful links