Simple AST rewriter example
This is an implementation of an AST rewriter based on clang's AST matchers, designed to be usable as a template for learning how to make such rewriters/beginning new ones. The rewriter itself replaces instances of `b ? "true" : "false"` with `base::ToString(b)` in C++ code, and was used to make several contributions to crbug.com/335797528.

Change-Id: I0602694ed1acb424cc3e5eb17a8e89837ded4350
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/6322829
Commit-Queue: Devon Loehr <dloehr@google.com>
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1430219}
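As a rough illustration of the rewrite this commit implements, the sketch below models the replacement rule on plain strings rather than on clang's AST. The function name `to_string_replacement` is hypothetical and not part of the tool. The only behaviour taken from the commit is that `"true" : "false"` maps to `base::ToString(cond)`, the reversed order negates the condition, and any other pair of literals is left alone.

```python
from typing import Optional


def to_string_replacement(cond: str, true_branch: str,
                          false_branch: str) -> Optional[str]:
    """Returns replacement text for `cond ? true_branch : false_branch`.

    Hypothetical helper for illustration only: the real tool works on AST
    nodes and source ranges, not on strings.
    """
    if (true_branch, false_branch) == ("true", "false"):
        return f"base::ToString({cond})"
    if (true_branch, false_branch) == ("false", "true"):
        # Reversed literals: negate the condition, parenthesized for safety.
        return f"base::ToString(!({cond}))"
    # Any other pair of literals is not a match and is left untouched.
    return None


print(to_string_replacement("b", "true", "false"))
print(to_string_replacement("n == 5", "false", "true"))
print(to_string_replacement("b", "true", "fls"))
```

The three calls correspond to cases in `tests/cond-test.cc`: a direct match, a reversed match, and a non-match.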
Committed by: Chromium LUCI CQ
Parent: 7dbbb28a9c
Commit: 3b59be9469
Changed paths: docs, tools/clang/ast_rewriter
@@ -30,28 +30,38 @@ For convenience, add `third_party/llvm-build/Release+Asserts/bin` to `$PATH`.

LLVM uses C++11 and CMake. Source code for Chromium clang tools lives in
[//tools/clang]. It is generally easiest to use one of the already-written tools
as the base for writing a new tool.
as the base for writing a new tool; the tool in [//tools/clang/ast_rewriter] is
designed for this purpose, and includes explanations of its major parts.

Chromium clang tools generally follow this pattern:

1. Instantiate a
   [`clang::ast_matchers::MatchFinder`][clang-docs-match-finder].
2. Call `addMatcher()` to register
2. Develop one or more [AST matchers][clang-matcher-tutorial] to locate the
   patterns of interest.
   1. `clang-query` is of great use for this part
3. Create a subclass of
   [`clang::ast_matchers::MatchFinder::MatchCallback`][clang-docs-match-callback]
   actions to execute when [matching][matcher-reference] the AST.
   to determine what actions to take on each match, and register it with
   `addMatcher()`.
3. Create a new `clang::tooling::FrontendActionFactory` from the `MatchFinder`.
4. Run the action across the specified files with
   [`clang::tooling::ClangTool::run`][clang-docs-clang-tool-run].
5. Serialize generated [`clang::tooling::Replacement`][clang-docs-replacement]s
   to `stdout`.
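
Step 5 can be pictured with a short sketch. It formats one edit in the `r:::path:::offset:::length:::replacement text` layout described in this document, encoding newlines in the replacement text as NUL characters the way `OutputHelper::PrintReplacement` in this commit does. The function name `format_edit` and the sample path are made up for illustration and are not part of any tool.

```python
def format_edit(path: str, offset: int, length: int, text: str) -> str:
    """Formats one replacement directive (hypothetical helper, not a tool API)."""
    # Directives are line-oriented, so embedded newlines are encoded as NUL.
    encoded = text.replace("\n", "\0")
    return f"r:::{path}:::{offset}:::{length}:::{encoded}"


edits = [format_edit("foo/bar.cc", 120, 20, "base::ToString(b)")]
print("==== BEGIN EDITS ====")
print("\n".join(edits))
print("==== END EDITS ====")
```

In this commit the tool itself prints only the directive lines; the begin/end tags are added afterwards by `dedup.py`.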

Other useful references when writing the tool:
Useful references when writing the tool:

* [Clang doxygen reference][clang-docs]
* [Tutorial for building tools using LibTooling and
  LibASTMatchers][clang-tooling-tutorial]
* [Tutorial for AST matchers][clang-matcher-tutorial]
* [AST matcher reference][matcher-reference]

### Edit serialization format
Tools do not directly edit files; rather, they output a series of _edits_ to be
applied later, which have the following format:

```
==== BEGIN EDITS ====
r:::path/to/file/to/edit:::offset1:::length1:::replacement text
@@ -103,7 +113,8 @@ tools/clang/scripts/build.py --bootstrap --without-android --without-fuchsia \
Running this command builds the [Oilpan plugin][//tools/clang/blink_gc_plugin],
the [Chrome style plugin][//tools/clang/plugins], and the [Blink to Chrome style
rewriter][//tools/clang/rewrite_to_chrome_style]. Additional arguments to
`--extra-tools` should be the name of subdirectories in [//tools/clang].
`--extra-tools` should be the name of subdirectories in [//tools/clang]. The
tool binary will be located in `third_party/llvm-build/Release+Asserts/bin`.

It is important to use --bootstrap as there appear to be [bugs](https://crbug.com/580745)
in the clang library this script produces if you build it with gcc, which is the default.
@@ -192,12 +203,18 @@ clang++ -Xclang -ast-dump -std=c++14 foo.cc | less -R
```

Using `clang-query` to dynamically test matchers (requires checking out
and building [clang-tools-extra][]):
and building [clang-tools-extra][]; this should happen automatically).
The binary is located in `third_party/llvm-build/Release+Asserts/bin`:

```shell
clang-query -p path/to/compdb base/memory/ref_counted.cc
```

If you're running it on a test file instead of a real one, the compdb is
optional; it will complain but it still works. Test matchers against the
specified file by running `match <matcher>`, or simply `m <matcher>`. Use of
`rlwrap` is highly recommended.

`printf` debugging:

```c++
@@ -229,6 +246,7 @@ When `--apply-edits` switch is not presented, tool outputs are compared to
that in this case, only one test file is expected.

[//tools/clang]: https://chromium.googlesource.com/chromium/src/+/main/tools/clang/
[//tools/clang/ast_rewriter]: https://chromium.googlesource.com/chromium/src/+/main/tools/clang/ast_rewriter
[clang-docs-match-finder]: http://clang.llvm.org/doxygen/classclang_1_1ast__matchers_1_1MatchFinder.html
[clang-docs-match-callback]: http://clang.llvm.org/doxygen/classclang_1_1ast__matchers_1_1MatchFinder_1_1MatchCallback.html
[matcher-reference]: http://clang.llvm.org/docs/LibASTMatchersReference.html
@@ -239,4 +257,5 @@ that in this case, only one test file is expected.
[//tools/clang/blink_gc_plugin]: https://chromium.googlesource.com/chromium/src/+/main/tools/clang/blink_gc_plugin/
[//tools/clang/plugins]: https://chromium.googlesource.com/chromium/src/+/main/tools/clang/plugins/
[//tools/clang/rewrite_to_chrome_style]: https://chromium.googlesource.com/chromium/src/+/main/tools/clang/rewrite_to_chrome_style/
[clang-tools-extra]: (https://github.com/llvm-mirror/clang-tools-extra)
[clang-tools-extra]: https://clang.llvm.org/extra/index.html
[clang-matcher-tutorial]: https://clang.llvm.org/docs/LibASTMatchers.html#astmatchers-writing
265
tools/clang/ast_rewriter/ASTRewriter.cpp
Normal file
@@ -0,0 +1,265 @@
// Copyright 2025 The Chromium Authors
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
//
// Clang tool to perform simple rewrites of C++ code using clang's AST matchers.
// For more general documentation, as well as building & running instructions,
// see
// https://chromium.googlesource.com/chromium/src/+/HEAD/docs/clang_tool_refactoring.md
//
// As implemented, this tool looks for instances of `b ? "true" : "false"` and
// replaces them with calls to `base::ToString`.
//
// If you want to create your own tool based on this one:
// 1. Copy the ast_rewriter directory, and update CMakeLists.txt appropriately
// 2. Follow the building and running procedure described in
//    the linked documentation:
//    a. Bootstrap the plugin
//    b. Build chrome once normally, without precompiled headers
//    c. Run using run_tool.py
// 3. Perform any post-processing of the generated directives using dedup.py
// 4. Apply the directives as described in the linked documentation
//
// Note: When running the tool, you may get spurious warnings due to chromium-
// specific changes (e.g. #pragma allow_unsafe_buffers) unknown to the tool.
// If so, it's easiest to disable -Werror in build/config/compiler.gni (set
// treat_warnings_as_errors = false). You may also want to disable the warning
// entirely while running the tool, by adding "-Wno-unknown-pragmas" to
// cflags_cc in an appropriate part of build/config/BUILD.gn. Make sure to
// rebuild the project (repeat step 2b) after changing the build config.

#include <string>

#include "OutputHelper.h"
#include "clang/AST/ASTContext.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/ASTMatchers/ASTMatchers.h"
#include "clang/ASTMatchers/ASTMatchersMacros.h"
#include "clang/Basic/SourceManager.h"
#include "clang/Frontend/CompilerInstance.h"
#include "clang/Frontend/FrontendActions.h"
#include "clang/Lex/Lexer.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/FormatVariadic.h"
#include "llvm/Support/TargetSelect.h"

// Prints a clang::SourceLocation or clang::SourceRange.
// Most AST types also have a dump() function to print to stderr.
#define LOG(e)                                                     \
  llvm::errs() << __FILE__ << ":" << __LINE__ << ": " << #e << " " \
               << (e).printToString(*result.SourceManager) << '\n';

namespace {

// Setting up the command-line; you can add additional options here if needed
static llvm::cl::OptionCategory rewriter_category("ast_rewriter options");
llvm::cl::extrahelp common_help(
    clang::tooling::CommonOptionsParser::HelpMessage);
llvm::cl::extrahelp more_help(
    "This tool replaces instances of `b ? \"true\" : \"false\"` with "
    "`base::ToString(b)`");

using namespace clang;
using namespace clang::ast_matchers;

// Specify what code patterns you're looking for here. AST matchers have more
// complete documentation on the clang website: see
// https://clang.llvm.org/docs/LibASTMatchers.html
// and
// https://clang.llvm.org/docs/LibASTMatchersReference.html
//
// This particular matcher looks for ternary operators whose second and third
// operands are "true" and "false", e.g. `b ? "true" : "false"`.
// Unfortunately, the matchers clang supports are incomplete; it can't directly
// check string contents, but it can check string length. Fortunately, we can
// perform additional checks on the AST itself once we have a potential match.
// Therefore, it's usually best to write a general matcher, and narrow down the
// final results later.
//
// The general process for creating a new matcher is to follow the AST matcher
// link above, then manually sift through the gigantic listing to determine
// which matchers (if any) fit your use case. It is strongly recommended to use
// clang-query to test matchers dynamically until you've got them working the
// way you want; see the clang_tool_refactoring.md file for more information.
//
// Arguments to a matcher are sub-matchers that serve to narrow down matches.
// Some arguments (stmt(), expr(), etc) don't narrow at all, but provide a way
// to reference different parts of the match. These arguments can be bound
// by calling .bind() with a string; this allows the part of the match to be
// referenced later by passing that string.
//
// The various kinds of matchers and the ways they're expected to be combined
// are complicated; the best way to learn about them is to read the docs and
// play around with clang-query.
StatementMatcher matchTernaryTrueFalse() {
  return conditionalOperator(  // Matches ternary operators ( _ ? _ : _)
      stmt().bind(
          "root"),  // Bind the cond operator itself so we can refer to it
      hasCondition(
          expr().bind("cond")),  // Bind just the condition, same reason

      // Check that the true and false branches are string literals of
      // length 4 or 5, and bind them. Match either order to account for
      // `b ? "false" : "true"`.
      hasTrueExpression(
          expr(anyOf(stringLiteral(hasSize(4)), stringLiteral(hasSize(5))))
              .bind("tru")),
      hasFalseExpression(
          expr(anyOf(stringLiteral(hasSize(4)), stringLiteral(hasSize(5))))
              .bind("fls")));
}

const char* headers_to_add[] = {"base/strings/to_string.h"};

// Once you know what you're looking for, the next step is to specify what to do
// when you find it. This can be done by creating a class which inherits from
// MatchFinder::MatchCallback and implements the `run` function.
//
// The Printer class is a minimal example: it takes the result of the matcher,
// pulls out whatever was bound to "root", and dumps it to the screen. Good for
// debugging, although clang-query is better for debugging the matcher itself.
class Printer : public MatchFinder::MatchCallback {
 public:
  virtual void run(const MatchFinder::MatchResult& Result) override {
    // Only works if the matcher bound a Stmt to the name "root".
    if (const Stmt* FS = Result.Nodes.getNodeAs<clang::Stmt>("root")) {
      FS->dump();
    }
  }
};

// The ASTRewriter class is a more interesting example; in addition to the `run`
// function, it stores an OutputHelper, which will also be passed to clang's
// FrontendActionFactory. The factory will ensure that the OutputHelper's setup
// and teardown methods are invoked at the beginning/end of each run, so our
// rewriter can safely call it to emit output.
class ASTRewriter : public MatchFinder::MatchCallback {
 protected:
  OutputHelper& output_helper_;

 public:
  explicit ASTRewriter(OutputHelper* output_helper)
      : output_helper_(*output_helper) {}

  // Replaces `b ? "true" : "false"` with base::ToString(b).
  // This function has access to the full power of clang's AST, so
  // you can do as much work as you want. Unfortunately, much like matchers, the
  // best way to figure out what AST methods are available to you is to sift
  // through the documentation (https://clang.llvm.org/doxygen/) for whatever
  // classes you have at hand, and hope you find something applicable to your
  // situation.
  virtual void run(const MatchFinder::MatchResult& result) override {
    ASTContext* Context = result.Context;
    // Extract the entire statement we matched.
    const Stmt* root = result.Nodes.getNodeAs<ConditionalOperator>("root");
    if (!root) {
      return;
    }

    // Don't replace in macros
    // Things WILL go wrong if you try
    // Just do them by hand
    if (root->getBeginLoc().isMacroID()) {
      return;
    }

    // Don't replace in third-party code, or in the function we're replacing
    // things with.
    StringRef filename =
        Context->getSourceManager().getFilename(root->getBeginLoc());
    if (filename.contains("third_party/") ||
        filename.contains("base/strings/to_string.h")) {
      return;
    }

    // Extract the various components that we care about.
    const Expr* cond = result.Nodes.getNodeAs<Expr>("cond");
    const StringLiteral* tru = result.Nodes.getNodeAs<StringLiteral>("tru");
    const StringLiteral* fls = result.Nodes.getNodeAs<StringLiteral>("fls");
    if (!cond || !tru || !fls) {
      return;
    }

    bool true_is_first = false;
    if (!tru->getString().compare("true") &&
        !fls->getString().compare("false")) {
      // "true" : "false"
      true_is_first = true;
    } else if (!tru->getString().compare("false") &&
               !fls->getString().compare("true")) {
      // "false" : "true"
      true_is_first = false;
    } else {
      return;
    }

    // An example of something more complicated that we can't easily do with
    // matchers: See if the original expression was parenthesized,
    // and remove the parens as well if so.
    const auto& parents = Context->getParents(*root);
    if (!parents.empty()) {
      const Stmt* paren_root = parents[0].get<ParenExpr>();
      if (paren_root) {
        root = paren_root;
      }
    }

    // We use getTokenRange here because that seems to be the format returned by
    // getSourceRange.
    CharSourceRange root_range =
        CharSourceRange::getTokenRange(root->getSourceRange());
    CharSourceRange cond_range =
        CharSourceRange::getTokenRange(cond->getSourceRange());

    // Compute the replacement text
    auto cond_text = Lexer::getSourceText(cond_range, *result.SourceManager,
                                          result.Context->getLangOpts());
    std::string cond_text_str = std::string(cond_text);
    if (!true_is_first) {
      cond_text_str = "!(" + cond_text_str + ")";
    }
    std::string replacement_text = "base::ToString(" + cond_text_str + ")";

    // Will emit a directive to replace root_range with replacement_text
    output_helper_.Replace(root_range, replacement_text, *result.SourceManager,
                           result.Context->getLangOpts());
  }
};

}  // namespace

// Putting it all together: this function is mostly boilerplate that combines
// the stuff we've already defined. The most interesting part is specifying the
// traversal method to use; TK_IgnoreUnlessSpelledInSource will ignore most
// implicit AST nodes that the user didn't write themselves. This is required
// to use matchers unless you have a deep understanding of clang's AST.
int main(int argc, const char* argv[]) {
  llvm::InitializeNativeTarget();
  llvm::InitializeNativeTargetAsmParser();

  llvm::Expected<clang::tooling::CommonOptionsParser> options =
      clang::tooling::CommonOptionsParser::create(argc, argv,
                                                  rewriter_category);
  assert(static_cast<bool>(options));
  clang::tooling::ClangTool tool(options->getCompilations(),
                                 options->getSourcePathList());

  OutputHelper output_helper((llvm::StringSet<>(headers_to_add)));

  MatchFinder match_finder;
  MatchFinder::MatchCallback* callback =
      // new Printer();
      new ASTRewriter(&output_helper);

  StatementMatcher final_matcher =
      traverse(TK_IgnoreUnlessSpelledInSource, matchTernaryTrueFalse());
  // More complicated use cases may want to add multiple matchers and callbacks
  match_finder.addMatcher(final_matcher, callback);

  std::unique_ptr<clang::tooling::FrontendActionFactory> factory =
      clang::tooling::newFrontendActionFactory(&match_finder, &output_helper);
  return tool.run(factory.get());
}
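
The "general matcher first, narrow down later" approach used in ASTRewriter.cpp can be modelled on plain strings. In this sketch the coarse stage mimics the matcher's length-only check and the fine stage mimics the exact comparison done in `ASTRewriter::run`. Both function names are invented for the example, and the literal pairs are taken from `tests/cond-test.cc` in this commit.

```python
def matcher_accepts(true_branch: str, false_branch: str) -> bool:
    """Coarse stage: mimics the length-only check done by the AST matcher."""
    return len(true_branch) in (4, 5) and len(false_branch) in (4, 5)


def callback_accepts(true_branch: str, false_branch: str) -> bool:
    """Fine stage: mimics the exact content check done in the callback."""
    return {true_branch, false_branch} == {"true", "false"}


# Literal pairs taken from tests/cond-test.cc.
cases = [("true", "fls"), ("0", "1"), ("tluo", "dalse"), ("true", "false"),
         ("false", "true")]
for tru, fls in cases:
    coarse = matcher_accepts(tru, fls)
    print(tru, fls, coarse, coarse and callback_accepts(tru, fls))
```

The pair `"tluo"`/`"dalse"` is the interesting one: it has the right lengths, so it passes the coarse stage, and is only rejected by the content check.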
30
tools/clang/ast_rewriter/CMakeLists.txt
Normal file
@@ -0,0 +1,30 @@
set(LLVM_LINK_COMPONENTS
  BitReader
  MCParser
  Option
  X86AsmParser
  X86CodeGen
  )

add_llvm_executable(ast_rewriter
  ASTRewriter.cpp
  OutputHelper.h
  OutputHelper.cpp
  )

target_link_libraries(ast_rewriter
  clangAST
  clangASTMatchers
  clangAnalysis
  clangBasic
  clangDriver
  clangEdit
  clangFrontend
  clangLex
  clangParse
  clangSema
  clangSerialization
  clangTooling
  )

cr_install(TARGETS ast_rewriter RUNTIME DESTINATION bin)
1
tools/clang/ast_rewriter/OWNERS
Normal file
@@ -0,0 +1 @@
dloehr@google.com
74
tools/clang/ast_rewriter/OutputHelper.cpp
Normal file
@@ -0,0 +1,74 @@
// Copyright 2025 The Chromium Authors
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

#include "OutputHelper.h"

void OutputHelper::Delete(const clang::CharSourceRange& replacement_range,
                          const clang::SourceManager& source_manager,
                          const clang::LangOptions& lang_opts) {
  Replace(replacement_range, "", source_manager, lang_opts);
}

// Replaces `replacement_range` with `replacement_text`.
void OutputHelper::Replace(const clang::CharSourceRange& replacement_range,
                           std::string replacement_text,
                           const clang::SourceManager& source_manager,
                           const clang::LangOptions& lang_opts) {
  clang::tooling::Replacement replacement(source_manager, replacement_range,
                                          std::move(replacement_text),
                                          lang_opts);

  llvm::StringRef file_path = replacement.getFilePath();
  if (file_path.empty()) {
    return;
  }

  PrintReplacement(file_path, replacement.getOffset(), replacement.getLength(),
                   replacement.getReplacementText());
}
// Inserts `lhs` and `rhs` to the left and right of `replacement_range`.
void OutputHelper::Wrap(const clang::CharSourceRange& replacement_range,
                        std::string_view lhs,
                        std::string_view rhs,
                        const clang::SourceManager& source_manager,
                        const clang::LangOptions& lang_opts) {
  clang::tooling::Replacement replacement(source_manager, replacement_range, "",
                                          lang_opts);

  llvm::StringRef file_path = replacement.getFilePath();
  if (file_path.empty()) {
    return;
  }

  PrintReplacement(file_path, replacement.getOffset(), 0, lhs);
  PrintReplacement(file_path, replacement.getOffset() + replacement.getLength(),
                   0, rhs);
}

// This is run automatically at the start of each source file.
bool OutputHelper::handleBeginSource(clang::CompilerInstance& compiler) {
  const clang::FrontendOptions& frontend_options = compiler.getFrontendOpts();

  // Validate our expectations about how this tool should be used
  assert((frontend_options.Inputs.size() == 1) &&
         "run_tool.py should invoke the rewriter one file at a time");
  const clang::FrontendInputFile& input_file = frontend_options.Inputs[0];
  assert(input_file.isFile() &&
         "run_tool.py should invoke the rewriter on actual files");

  current_language_ = input_file.getKind().getLanguage();

  // Report that we succeeded
  return true;
}

// This is run automatically at the end of the file.
void OutputHelper::handleEndSource() {
  for (auto& file : files_replaced_in_) {
    for (auto& header : headers_to_add_) {
      llvm::outs() << "include-user-header:::" << file.getKey()
                   << ":::-1:::-1:::" << header.getKey() << "\n";
    }
  }
}
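
`OutputHelper::Wrap` emits two zero-length replacements, one at the start of the range and one at its end. The sketch below shows the intended effect on an in-memory string. `apply_edits` is a hypothetical stand-in written for this example (the real application is handled by the scripts described in the linked documentation), and it assumes non-overlapping edits applied from the highest offset down so that pending offsets stay valid.

```python
def apply_edits(source, edits):
    """Applies (offset, length, text) edits to `source`.

    Illustrative only. Edits are applied from the highest offset down, so
    applying one edit never shifts the offsets of the edits still pending.
    """
    for offset, length, text in sorted(edits, reverse=True):
        source = source[:offset] + text + source[offset + length:]
    return source


def wrap(offset, length, lhs, rhs):
    """Mirrors Wrap: two insertions around [offset, offset + length)."""
    return [(offset, 0, lhs), (offset + length, 0, rhs)]


code = "return b;"
# `b` sits at offset 7 and is one character long.
print(apply_edits(code, wrap(7, 1, "base::ToString(", ")")))
```

A plain `Replace` is the single-edit case of the same idea, for example `apply_edits("abc", [(1, 1, "XY")])`.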
112
tools/clang/ast_rewriter/OutputHelper.h
Normal file
@@ -0,0 +1,112 @@
// Copyright 2025 The Chromium Authors
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

#ifndef TOOLS_CLANG_AST_REWRITER_OUTPUTHELPER_H_
#define TOOLS_CLANG_AST_REWRITER_OUTPUTHELPER_H_

#include <string>

#include "clang/Tooling/Core/Replacement.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/ADT/StringSet.h"

// This is a general helper class for emitting the substitution directives
// consumed by apply_edits.py.
// See
// https://chromium.googlesource.com/chromium/src/+/HEAD/docs/clang_tool_refactoring.md
// for general documentation on the format.
//
// From a consumer's perspective, the most important functions are `Delete`,
// `Replace`, and `Wrap`, which each emit a substitution directive to stdout.
// The class also maintains a list of headers to be added to every file where a
// replacement occurred; the directives are emitted at the end of the file.
//
// For the most part, you should be able to re-use this class without any
// changes. It's possible certain use cases may require more complex logic, if
// e.g. you're doing multiple kinds of replacement at once, and different ones
// need to add different sets of headers.
//
// The substitution directives all take a CharSourceRange as their primary
// argument. Despite the name, these represent a range of _either_ characters or
// tokens, as reported by their isTokenRange() method. Both versions store a
// start and an end SourceLocation; in the 'char' case, these point to the first
// and last character of the range, respectively. In the 'token' case, they
// point to the first character in the first/last token of the range. This means
// that typically they will point _before_ the last character of the range, e.g.
// in the code "Foo + Bar", the end of a character range will point at 'r',
// while the end of a token range will point at 'B'. In both cases, the start
// of the range will be 'F'.
//
// From a usage perspective, the primary difference is which construction
// function you should call. If you have character-granular information, then
// call CharSourceRange::getCharRange; if you have token-level information, then
// call CharSourceRange::getTokenRange.
class OutputHelper : public clang::tooling::SourceFileCallbacks {
 public:
  OutputHelper() = default;
  ~OutputHelper() = default;

  OutputHelper(const OutputHelper&) = delete;
  OutputHelper& operator=(const OutputHelper&) = delete;

  OutputHelper(llvm::StringSet<> headers_to_add)
      : headers_to_add_(std::move(headers_to_add)) {}

  // Replaces `replacement_range` with `replacement_text`.
  void Replace(const clang::CharSourceRange& replacement_range,
               std::string replacement_text,
               const clang::SourceManager& source_manager,
               const clang::LangOptions& lang_opts);

  // Deletes `replacement_range`.
  void Delete(const clang::CharSourceRange& replacement_range,
              const clang::SourceManager& source_manager,
              const clang::LangOptions& lang_opts);

  // Inserts `lhs` and `rhs` to the left and right of `replacement_range`.
  void Wrap(const clang::CharSourceRange& replacement_range,
            std::string_view lhs,
            std::string_view rhs,
            const clang::SourceManager& source_manager,
            const clang::LangOptions& lang_opts);

 private:
  // By inheriting from clang::tooling::SourceFileCallbacks, OutputHelper
  // automatically executes setup and teardown code at the beginning/end of each
  // file.

  // This is run automatically at the start of each source file.
  bool handleBeginSource(clang::CompilerInstance& compiler) override;

  // This is run automatically at the end of the file.
  void handleEndSource() override;

  // Called by PrintReplacement to determine if we should actually replace in
  // this file.
  bool ShouldOutput() { return current_language_ == clang::Language::CXX; }

  // Emit the requested replacement in the proper format.
  void PrintReplacement(llvm::StringRef file_path,
                        unsigned offset,
                        unsigned length,
                        std::string_view replacement_text) {
    if (ShouldOutput()) {
      files_replaced_in_.insert(file_path);
      std::string final_text = std::string(replacement_text);
      // The rewriting format expects newlines to be replaced with \0
      std::replace(final_text.begin(), final_text.end(), '\n', '\0');
      llvm::outs() << "r:::" << file_path << ":::" << offset << ":::" << length
                   << ":::" << final_text << "\n";
    }
  }

  // The language of the file we're currently looking at.
  clang::Language current_language_ = clang::Language::Unknown;
  // At the end, we'll add additional headers to each file we emitted a
  // replacement directive for.
  llvm::StringSet<> files_replaced_in_;
  llvm::StringSet<> headers_to_add_;
};

#endif  // TOOLS_CLANG_AST_REWRITER_OUTPUTHELPER_H_
99
tools/clang/ast_rewriter/dedup.py
Normal file
@@ -0,0 +1,99 @@
# Copyright 2025 The Chromium Authors
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.

import os
import sys

###
# The ASTRewriter plugin emits substitution directives independently for each
# TU. This means there will be several duplicates (e.g. in headers that are
# included in multiple source files). The format used for paths is also
# inconsistent.
#
# This is a general-purpose post-processing script that can deduplicate,
# filter edits based on path, add user headers if not already present, etc.
# It also adds the begin/end tags when writing out the edits.
#
# usage: `python3 dedup.py directives.txt`
###

### Configurable options
# List of headers to add to each modified file
headers_to_add = ["base/strings/to_string.h"]
# List of paths we do/don't want to replace in
paths_to_exclude = ["third_party"]
paths_to_include = ["/components/", "/content/", "/chrome/"]


# Paths we don't want to process
def filter_path(path):
    """
    Examine a path and return true if we want to filter it out,
    e.g. because it's in third_party. Feel free to customize the logic.
    """
    if (any(exclude in path for exclude in paths_to_exclude)):
        return True

    if (not any(include in path for include in paths_to_include)):
        return True

    return False


### Actual work
def ProcessFile(filename, deduped_contents, unique_paths):
    """ Read every replacement in a file, normalizing paths and removing
    duplicates, as well as any paths we choose to filter out. Keep track
    of all unique paths we see so we know which files to add headers to.

    filename: the name of the file to be processed
    deduped_contents: the set of replacements we've already processed
    unique_paths: the set of unique replacement paths we've seen.
    """
    with open(filename) as f:
        for line in f.readlines():
            parts = line.split(":::")
            if len(parts) < 2:
                print("Skipping unexpected line: ", line)
                continue
            path = os.path.normpath(parts[1])
            if filter_path(path):
                continue

            if path not in unique_paths:
                unique_paths.add(path)

            parts[1] = path
            new_line = ":::".join(parts)
            if new_line not in deduped_contents:
                deduped_contents.add(new_line)


def DedupFiles(filenames):
    deduped_contents = set()
    unique_paths = set()

    for file in filenames:
        ProcessFile(file, deduped_contents, unique_paths)

    # This may not be necessary if the tool already emits these directives,
    # but sometimes that may be inconvenient.
    for path in unique_paths:
        for header in headers_to_add:
            deduped_contents.add(
                f"include-user-header:::{path}:::-1:::-1:::{header}\n")

    output_file = "deduped.txt"
    WriteFile(output_file, sorted(deduped_contents))


def WriteFile(outfile, lines):
    with open(outfile, "w") as f:
        f.write("==== BEGIN EDITS ====\n")
        f.write("".join(lines))
        f.write("==== END EDITS ====\n")


if __name__ == "__main__":
    DedupFiles(sys.argv[1:])
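
To see why path normalization matters for deduplication, consider two directives that describe the same edit but spell the path differently. This sketch reuses the `os.path.normpath` step from `ProcessFile`; the sample paths and the `normalize` helper name are invented for the example and do not come from real tool output.

```python
import os


def normalize(directive):
    """Rewrites the path field (the second `:::` field) of a directive."""
    parts = directive.split(":::")
    parts[1] = os.path.normpath(parts[1])
    return ":::".join(parts)


raw = [
    "r:::out/Default/../../chrome/browser/foo.cc:::10:::5:::base::ToString(b)",
    "r:::chrome/browser/foo.cc:::10:::5:::base::ToString(b)",
]
# A set keeps one copy of each distinct normalized directive.
deduped = {normalize(d) for d in raw}
print(len(deduped))
```

Without the normalization step, the two lines would compare as different strings and both would survive into the output.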
20
tools/clang/ast_rewriter/tests/cond-test.cc
Normal file
@@ -0,0 +1,20 @@
// Copyright 2025 The Chromium Authors
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

#pragma clang diagnostic ignored "-Wunused-variable"

#include <string>

void foo(bool b, int n) {
  auto x = b ? 1 : 2;
  auto y = b ? "true" : "fls";
  auto w = b ? "0" : "1";
  auto a = b ? "tluo" : "dalse";
  auto z = b ? "true" : "false";
  const char* z1 = b ? "true" : "false";
  const char* z2 = (b ? "true" : "false");
  std::string z3 = b ? "true" : "false";
  std::string z4 = b ? "false" : "true";
  std::string z5 = n == 5 ? "false" : "true";
}