[MDEV-26288] Introduce and apply clang-format to Spider storage engine Created: 2021-07-31  Updated: 2021-10-16  Resolved: 2021-10-16

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Spider
Fix Version/s: N/A

Type: Task Priority: Minor
Reporter: Nayuta Yanagisawa (Inactive) Assignee: Nayuta Yanagisawa (Inactive)
Resolution: Won't Do Votes: 0
Labels: None

Issue Links:
Relates
Epic Link: Spider Refactoring

 Description   

I'd like to use clang-format to make the style of the code consistent. Applying clang-format to the whole existing code could result in messing up the commit history, but this seems not to be a big problem because, good or bad, the commit history of the Spider have not much information. Of course, I will use a config that doesn't change the existing code much in order to keep the history as far as possible. The code format should be checked by CI finally.



 Comments   
Comment by Nayuta Yanagisawa (Inactive) [ 2021-08-02 ]

We get quite large diffs with default styles.

> clang-format --version
clang-format version 10.0.0-4ubuntu1

> for f in *.cc *h; do diff $f <(clang-format $f --style=LLVM) | wc -l; done | awk '{s+=$0}END{print s}'
132136
vagrant@bld:~/.../storage/spider(10.7)
> for f in *.cc *h; do diff $f <(clang-format $f --style=Google) | wc -l; done | awk '{s+=$0}END{print s}'
127682
vagrant@bld:~/.../storage/spider(10.7)
> for f in *.cc *h; do diff $f <(clang-format $f --style=Chromium) | wc -l; done | awk '{s+=$0}END{print s}'
132520
> for f in *.cc *h; do diff $f <(clang-format $f --style=Mozilla) | wc -l; done | awk '{s+=$0}END{print s}'
137417
> for f in *.cc *h; do diff $f <(clang-format $f --style=Webkit) | wc -l; done | awk '{s+=$0}END{print s}'
201634
> for f in *.cc *h; do diff $f <(clang-format $f --style=Microsoft) | wc -l; done | awk '{s+=$0}END{print s}'
203659
> for f in *.cc *h; do diff $f <(clang-format $f --style=GNU) | wc -l; done | awk '{s+=$0}END{print s}'
206355

Comment by Nayuta Yanagisawa (Inactive) [ 2021-08-02 ]

Applying clang-format to the existing code base would mess up the history too much. So, it might be better to use a tool like git-clang-format.

Comment by Nayuta Yanagisawa (Inactive) [ 2021-08-02 ]

More accurate calculations:

> for f in *.cc *h; do diff $f <(clang-format $f) | grep '^>' | wc -l ; done | awk '{s+=$0}END{print s}'
54642
> for f in *.cc *h; do diff $f <(clang-format $f --style=LLVM) | grep '^>' | wc -l ; done | awk '{s+=$0}END{print s}'
40763
> for f in *.cc *h; do diff $f <(clang-format $f --style=Google) | grep '^>' | wc -l ; done | awk '{s+=$0}END{print s}'
37486
> for f in *.cc *h; do diff $f <(clang-format $f --style=Chromium) | grep '^>' | wc -l ; done | awk '{s+=$0}END{print s}'
41048
> for f in *.cc *h; do diff $f <(clang-format $f --style=Mozilla) | grep '^>' | wc -l ; done | awk '{s+=$0}END{print s}'
48977
> for f in *.cc *h; do diff $f <(clang-format $f --style=Webkit) | grep '^>' | wc -l ; done | awk '{s+=$0}END{print s}'
84508
> for f in *.cc *h; do diff $f <(clang-format $f --style=Microsoft) | grep '^>' | wc -l ; done | awk '{s+=$0}END{print s}'
84241
> for f in *.cc *h; do diff $f <(clang-format $f) | grep '^>' | wc -l ; done | awk '{s+=$0}END{print s}'
54642

This means that about 30% of the code is modified even by the Google style. We might optimize .clang-format to fit with the current codebase. However, I expect that at least 20-25% of the code would be modified be cause some existing styles are never supported by clang-format.

> find . -name '*.cc' -or -name '*.h' | xargs wc -l | tail -n1
 120048 total

Comment by Nayuta Yanagisawa (Inactive) [ 2021-08-02 ]

I thought about whether or not we should introduce a code formatter. For contributors outside MariaDB Corporation/Foundation, it would be far better to have a mechanism to enforce a consistent style. Applying clang-format to the whole codebase of the Spider might reduce the readability of the commit history (when doing git blame), but I believe this is necessary sacrifice.

Comment by Nayuta Yanagisawa (Inactive) [ 2021-08-02 ]

I don't believe that there is any great superiority or inferiority in coding styles. What is important is that the coding style is defined. I think that adapting the coding style of MySQL would be a good option.

Comment by Nayuta Yanagisawa (Inactive) [ 2021-08-02 ]

Another option is to use .clang-format at the project root.

Comment by Nayuta Yanagisawa (Inactive) [ 2021-10-16 ]

We could apply the .clang-format at the root of the repository for new commits and it is not allowed to modify the existing code. So, I close the present issue.

Generated at Thu Feb 08 09:44:10 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.