0

Reland "orderfile: Remove the patching step"

This reverts commit 3099fe8081.

Reason for revert: Patching did not improve performance https://crbug.com/340534475#comment22

Original change's description:
> Revert "orderfile: Remove the patching step"
>
> This reverts commit 9083892277.
>
> Reason for revert: performance regression on android-pixel6-perf-pgo: https://screenshot.googleplex.com/6yxbc7zZDcZbvKw
>
> Original change's description:
> > orderfile: Remove the patching step
> >
> > Remove the 'Patch Orderfile' step from the Orderfile Generator. Replace
> > it with a simpler step 'Add dummy functions'. It adds the following
> > symbols to the beginning and the end of the orderfile, just like the
> > current orderfile patching step does:
> > * dummy_function_start_of_ordered_text
> > * dummy_function_end_of_ordered_text
> >
> > The symbol `__cxx_global_var_init` is currently added along with the
> > symbols marking begin/end of the ordered address range. While the latter
> > symbols are important for correctness at runtime, the former does
> > not exist these days. It used to be a large and uninstrumented symbol.
> > Remove mentions of it in orderfiles.
> >
> > Note: This change does _not_ remove generation of the unpatched
> > orderfile and uploading it to the cloud storage. Leaving it as TODO for
> > a separate change. The only difference remaining between the normal and
> > the 'unpatched' orderfile after this change would be the presence of the
> > dummy functions mentioned above.
> >
> > == Observation(s) from running locally
> >
> > About 900 symbols from the orderfile that cannot be found in the
> > uninstrumented binary because of the known problem [1]. The current
> > 'patching' step does not fix this problem. There is, however, a
> > possibility to mitigate the problem with another style of patching,
> > which is not infra-friendly because it requires two builds with
> > different configuration on the same builder.
> >
> > A few dozen symbols from the orderfile appear at 2+ offsets in the
> > binary. I think it could be in part because of the DFM splitting, but I
> > am not sure. It would be nice to investigate. There is one example of
> > such symbol: ReadHuffmanCode. It is defined by brotli and libwebp (both
> > in third_party), and they are static symbols only referenced from the
> > same file where they are defined.
> >
> > [1] crbug.com/356360613:
> >     "Teach orderfile generator to respect suffixed function names
> >      from LTO"
> >
> > Bug: 340534475
> > Change-Id: Icddd3b84576cfe76bfb652fe856aac3d2701dce6
> > Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5782742
> > Reviewed-by: Peter Wen <wnwen@chromium.org>
> > Commit-Queue: Peter Wen <wnwen@chromium.org>
> > Reviewed-by: Rasika Navarange <rasikan@google.com>
> > Cr-Commit-Position: refs/heads/main@{#1341171}
>
> Bug: 340534475
> Change-Id: I08d2865a5758c2318bf21730a0232f1465cfcc96
> Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5796046
> Commit-Queue: Egor Pasko <pasko@chromium.org>
> Auto-Submit: Egor Pasko <pasko@chromium.org>
> Commit-Queue: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
> Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
> Cr-Commit-Position: refs/heads/main@{#1343395}

Bug: 340534475
Change-Id: Icd0e22cb95011a30f266df31846588f5f503895f
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5797100
Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
Auto-Submit: Peter Wen <wnwen@chromium.org>
Commit-Queue: Peter Wen <wnwen@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1343821}
This commit is contained in:
Peter Wen
2024-08-19 22:58:58 +00:00
committed by Chromium LUCI CQ
parent 9e8e752e72
commit a0204d5207
4 changed files with 55 additions and 391 deletions

@ -91,20 +91,10 @@ in [orderfile.py](../tools/perf/contrib/orderfile/orderfile.py). These profiles
are a list of function offsets into the binary that were called during execution
of the benchmarks.
3. **Cluster the symbols from the profiles to generate the unpatched orderfile.**
3. **Cluster the symbols from the profiles to generate the orderfile.**
The offsets are processed and merged using a
[clustering](../tools/cygprofile/cluster.py) algorithm to produce an orderfile.
4. **Build an uninstrumented Chrome and patch the orderfile with it.** The
orderfile based on an instrumented build cannot be applied directly to an
uninstrumented build. The orderfile needs to be
[patched](../tools/cygprofile/patch_orderfile.py) with an uninstrumented build
because the instrumentation has a non-trivial impact on inlining decisions and
has identical code folding disabled. The patching step produces the final
orderfile which will be in `clank/orderfiles/` for internal builds, or in
`orderfiles/` if running the generator script with `--public`. The
uninstrumented build will be in `out/orderfile_arm64_uninstrumented_out`.
5. **Run benchmarks on the final orderfile.** We run some benchmarks to compare
4. **Run benchmarks on the final orderfile.** We run some benchmarks to compare
the performance with/without the orderfile. You can supply the `--no-benchmark`
flag to skip this step.

@ -3,15 +3,14 @@
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.
""" A utility to generate an up-to-date orderfile.
""" A utility to generate an orderfile.
The orderfile is used by the linker to order text sections such that the
sections are placed consecutively in the order specified. This allows us
to page in less code during start-up.
The orderfile is used by the linker to order symbols such that they
are placed consecutively. See //docs/orderfile.md.
Example usage:
tools/cygprofile/orderfile_generator_backend.py --use-remoteexec \
--target-arch=arm
--target-arch=arm64
"""
@ -33,7 +32,6 @@ import time
from typing import Dict, List
import cluster
import patch_orderfile
import process_profiles
import profile_android_startup
@ -61,6 +59,16 @@ _ARCH_GN_ARGS = {
_RESULTS_KEY_SPEEDOMETER = 'Speedometer2.0'
def _ReadNonEmptyStrippedFromFile(file_name):
stripped_lines = []
with open(file_name, 'r') as file:
for line in file:
stripped_line = line.strip()
if stripped_line:
stripped_lines.append(stripped_line)
return stripped_lines
class CommandError(Exception):
"""Indicates that a dispatched shell command exited with a non-zero status."""
@ -591,9 +599,6 @@ class OrderfileGenerator:
assert not options.profile_save_dir, (
'--profile-save-dir cannot be used with --skip-profile')
# Outlined function handling enabled by default for all architectures.
self._order_outlined_functions = not options.noorder_outlined_functions
self._output_data = {}
self._step_recorder = StepRecorder()
self._compiler = None
@ -697,13 +702,21 @@ class OrderfileGenerator:
shutil.copytree(self._host_profile_root, self._options.profile_save_dir)
logging.info('Saved profiles')
def _PatchOrderfile(self):
"""Patches the orderfile using clean version of libchrome.so."""
self._step_recorder.BeginStep('Patch Orderfile')
def _AddDummyFunctions(self):
# TODO(crbug.com/340534475): Stop writing the `unpatched_orderfile` and
# uploading it to the cloud storage.
self._step_recorder.BeginStep('Add dummy functions')
assert self._compiler is not None
patch_orderfile.GeneratePatchedOrderfile(
self._GetUnpatchedOrderfileFilename(), self._compiler.lib_chrome_so,
self._GetPathToOrderfile(), self._order_outlined_functions)
symbols = _ReadNonEmptyStrippedFromFile(
self._GetUnpatchedOrderfileFilename())
with open(self._GetPathToOrderfile(), 'w') as f:
# Make sure the anchor functions are located in the right place, here and
# after everything else.
# See the comment in //base/android/library_loader/anchor_functions.cc.
f.write('dummy_function_start_of_ordered_text\n')
for sym in symbols:
f.write(sym + '\n')
f.write('dummy_function_end_of_ordered_text\n')
def _VerifySymbolOrder(self):
self._step_recorder.BeginStep('Verify Symbol Order')
@ -1020,30 +1033,20 @@ class OrderfileGenerator:
self._GenerateAndProcessProfile()
self._MaybeArchiveOrderfile(self._GetUnpatchedOrderfileFilename())
if self._options.patch:
if self._options.profile:
self._RemoveBlanks(self._GetUnpatchedOrderfileFilename(),
self._GetPathToOrderfile())
self._compiler = ClankCompiler(self._uninstrumented_out_dir,
self._step_recorder, self._options,
self._GetPathToOrderfile(),
self._native_library_build_variant)
self._compiler.CompileLibchrome(instrumented=False)
self._PatchOrderfile()
# Because identical code folding is a bit different with and without
# the orderfile build, we need to re-patch the orderfile with code
# folding as close to the final version as possible.
self._compiler.CompileLibchrome(instrumented=False,
force_relink=True)
self._PatchOrderfile()
self._compiler.CompileLibchrome(instrumented=False,
force_relink=True)
if self._VerifySymbolOrder():
self._MaybeArchiveOrderfile(self._GetPathToOrderfile(),
use_new_cloud=True)
else:
self._SaveForDebugging(self._GetPathToOrderfile())
if self._options.profile:
self._RemoveBlanks(self._GetUnpatchedOrderfileFilename(),
self._GetPathToOrderfile())
self._compiler = ClankCompiler(self._uninstrumented_out_dir,
self._step_recorder, self._options,
self._GetPathToOrderfile(),
self._native_library_build_variant)
self._AddDummyFunctions()
self._compiler.CompileLibchrome(instrumented=False, force_relink=False)
if self._VerifySymbolOrder():
self._MaybeArchiveOrderfile(self._GetPathToOrderfile(),
use_new_cloud=True)
else:
self._SaveForDebugging(self._GetPathToOrderfile())
if self._options.benchmark:
self._SaveBenchmarkResultsToOutput(
@ -1104,12 +1107,12 @@ def CreateArgumentParser():
parser.add_argument('--output-json', action='store', dest='json_file',
help='Location to save stats in json format')
parser.add_argument(
'--skip-profile', action='store_false', dest='profile', default=True,
'--skip-profile',
action='store_false',
dest='profile',
default=True,
help='Don\'t generate a profile on the device. Only patch from the '
'existing profile.')
parser.add_argument(
'--skip-patch', action='store_false', dest='patch', default=True,
help='Only generate the raw (unpatched) orderfile, don\'t patch it.')
parser.add_argument('--use-remoteexec',
action='store_true',
help='Enable remoteexec. see //build/toolchain/rbe.gni.',
@ -1141,9 +1144,6 @@ def CreateArgumentParser():
default=False,
help='Use the webview startup benchmark profiles to '
'generate the orderfile.')
parser.add_argument('--noorder-outlined-functions',
action='store_true',
help='Disable outlined functions in the orderfile.')
parser.add_argument('--pregenerated-profiles', default=None, type=str,
help=('Pregenerated profiles to use instead of running '
'profile step. Cannot be used with '
@ -1152,12 +1152,14 @@ def CreateArgumentParser():
help=('Directory to save any profiles created. These can '
'be used with --pregenerated-profiles. Cannot be '
'used with --skip-profiles.'))
parser.add_argument('--upload-ready-orderfiles', action='store_true',
parser.add_argument('--upload-ready-orderfiles',
action='store_true',
help=('Skip orderfile generation and manually upload '
'orderfiles (both patched and unpatched) from '
'their normal location in the tree to the cloud '
'storage. DANGEROUS! USE WITH CARE!'))
parser.add_argument('--streamline-for-debugging', action='store_true',
'the orderfile from its normal location in '
'the tree to the cloud storage. '
'DANGEROUS! USE WITH CARE!'))
parser.add_argument('--streamline-for-debugging',
action='store_true',
help=('Streamline where possible the run for faster '
'iteration while debugging. The orderfile '
'generated will be valid and nontrivial, but '

@ -1,262 +0,0 @@
#!/usr/bin/env vpython3
# Copyright 2013 The Chromium Authors
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.
"""Patch an orderfile.
Starting with a list of symbols in a binary and an orderfile (ordered list of
symbols), matches the symbols in the orderfile and augments each symbol with
the symbols residing at the same address (due to having identical code). The
output is a list of symbols appropriate for the linker
option --symbol-ordering-file for lld. Note this is not usable with gold (which
uses section names to order the binary).
Note: It is possible to have.
- Several symbols mapping to the same offset in the binary.
- Several offsets for a given symbol (because we strip the ".clone." and other
suffixes)
The general pipeline is:
1. Get the symbol infos (name, offset, size, section) from the binary
2. Get the symbol names from the orderfile
3. Find the orderfile symbol names in the symbols coming from the binary
4. For each symbol found, get all the symbols at the same address
5. Output them to an updated orderfile suitable lld
"""
import argparse
import logging
import re
import sys
import symbol_extractor
# Suffixes for symbols. These are due to method splitting for inlining and
# method cloning for various reasons including constant propagation and
# inter-procedural optimization.
_SUFFIXES = ('.clone.', '.part.', '.isra.', '.constprop.')
# The pattern and format for a linker-generated outlined function.
_OUTLINED_FUNCTION_RE = re.compile(r'OUTLINED_FUNCTION_(?P<index>\d+)$')
_OUTLINED_FUNCTION_FORMAT = 'OUTLINED_FUNCTION_{}'
def RemoveSuffixes(name):
"""Strips method name suffixes from cloning and splitting.
.clone. comes from cloning in -O3.
.part. comes from partial method splitting for inlining.
.isra. comes from inter-procedural optimizations.
.constprop. is cloning for constant propagation.
"""
for suffix in _SUFFIXES:
name = name.split(suffix)[0]
return name
def _UniqueGenerator(generator):
"""Converts a generator to skip yielding elements already seen.
Example:
@_UniqueGenerator
def Foo():
yield 1
yield 2
yield 1
yield 3
Foo() yields 1,2,3.
"""
def _FilteringFunction(*args, **kwargs):
returned = set()
for item in generator(*args, **kwargs):
if item in returned:
continue
returned.add(item)
yield item
return _FilteringFunction
def _GroupSymbolsByOffset(binary_filename):
"""Produce a map symbol name -> all symbol names at same offset.
Suffixes are stripped.
"""
symbol_infos = [
s._replace(name=RemoveSuffixes(s.name))
for s in symbol_extractor.SymbolInfosFromBinary(binary_filename)]
offset_map = symbol_extractor.GroupSymbolInfosByOffset(symbol_infos)
missing_offsets = 0
sym_to_matching = {}
for sym in symbol_infos:
if sym.offset not in offset_map:
missing_offsets += 1
continue
matching = [s.name for s in offset_map[sym.offset]]
assert sym.name in matching
sym_to_matching[sym.name] = matching
return sym_to_matching
def _GetMaxOutlinedIndex(sym_dict):
"""Find the largest index of an outlined functions.
See _OUTLINED_FUNCTION_RE for the definition of the index. In practice the
maximum index equals the total number of outlined functions. This function
asserts that the index is near the total number of outlined functions.
Args:
sym_dict: Dict with symbol names as keys.
Returns:
The largest index of an outlined function seen in the keys of |sym_dict|.
"""
seen = set()
for sym in sym_dict:
m = _OUTLINED_FUNCTION_RE.match(sym)
if m:
seen.add(int(m.group('index')))
if not seen:
return None
max_index = max(seen)
# Assert that the number of outlined functions is reasonable compared to the
# indices we've seen. At the time of writing, outlined functions are indexed
# consecutively from 0. If this radically changes, then other outlining
# behavior may have changed to violate some assumptions.
assert max_index < 2 * len(seen)
return max_index
def _StripSuffixes(section_list):
"""Remove all suffixes on items in a list of symbols."""
return [RemoveSuffixes(section) for section in section_list]
def _PatchedSymbols(symbol_to_matching, profiled_symbols, max_outlined_index):
"""Internal computation of an orderfile.
Args:
symbol_to_matching: ({symbol name -> [symbols at same offset]}), as from
_GroupSymbolsByOffset.
profiled_symbols: ([symbol names]) as from the unpatched orderfile.
max_outlined_index: (int or None) if not None, add outlined function names
to the end of the patched orderfile.
Yields:
Patched symbols, in a consistent order to profiled_symbols.
"""
missing_symbol_count = 0
seen_symbols = set()
for sym in profiled_symbols:
if _OUTLINED_FUNCTION_RE.match(sym):
continue
if sym in seen_symbols:
continue
if sym not in symbol_to_matching:
missing_symbol_count += 1
continue
for matching in symbol_to_matching[sym]:
if matching in seen_symbols:
continue
if _OUTLINED_FUNCTION_RE.match(matching):
continue
yield matching
seen_symbols.add(matching)
assert sym in seen_symbols
logging.warning('missing symbol count = %d', missing_symbol_count)
if max_outlined_index is not None:
# The number of outlined functions may change with each build, so only
# ordering the outlined functions currently in the binary will not
# guarantee ordering after code changes before the next orderfile is
# generated. So we double the number of outlined functions as a measure of
# security.
for idx in range(2 * max_outlined_index + 1):
yield _OUTLINED_FUNCTION_FORMAT.format(idx)
@_UniqueGenerator
def ReadOrderfile(orderfile):
"""Reads an orderfile and cleans up symbols.
Args:
orderfile: The name of the orderfile.
Yields:
Symbol names, cleaned and unique.
"""
with open(orderfile) as f:
for line in f:
line = line.strip()
if line:
yield line
def GeneratePatchedOrderfile(unpatched_orderfile: str,
native_lib_filename: str,
output_filename: str,
order_outlined: bool = False):
"""Writes a patched orderfile.
Args:
unpatched_orderfile: Path to the unpatched orderfile.
native_lib_filename: Path to the native library.
output_filename: Path to the patched orderfile.
order_outlined: If outlined function symbols are present in the
native library, then add ordering of them to the orderfile. If there
are no outlined function symbols present then this flag has no effect.
"""
symbol_to_matching = _GroupSymbolsByOffset(native_lib_filename)
if order_outlined:
max_outlined_index = _GetMaxOutlinedIndex(symbol_to_matching)
if not max_outlined_index:
# Only generate ordered outlined functions if they already appeared in
# the library.
max_outlined_index = None
else:
max_outlined_index = None # Ignore outlining.
profiled_symbols = ReadOrderfile(unpatched_orderfile)
with open(output_filename, 'w') as f:
# Make sure the anchor functions are located in the right place, here and
# after everything else.
# See the comment in //base/android/library_loader/anchor_functions.cc.
#
# __cxx_global_var_init is one of the largest symbols (~38kB as of May
# 2018), called extremely early, and not instrumented.
for first_section in ('dummy_function_start_of_ordered_text',
'__cxx_global_var_init'):
f.write(first_section + '\n')
for sym in _PatchedSymbols(symbol_to_matching, profiled_symbols,
max_outlined_index):
f.write(sym + '\n')
f.write('dummy_function_end_of_ordered_text\n')
def _CreateArgumentParser():
"""Creates and returns the argument parser."""
parser = argparse.ArgumentParser()
parser.add_argument('--target-arch', help='Unused')
parser.add_argument('--unpatched-orderfile', required=True,
help='Path to the unpatched orderfile')
parser.add_argument('--native-library', required=True,
help='Path to the native library')
parser.add_argument('--output-file', required=True, help='Output filename')
return parser
def main():
parser = _CreateArgumentParser()
options = parser.parse_args()
GeneratePatchedOrderfile(options.unpatched_orderfile, options.native_library,
options.output_file)
return 0
if __name__ == '__main__':
logging.basicConfig(level=logging.INFO)
sys.exit(main())

@ -1,66 +0,0 @@
#!/usr/bin/env vpython3
# Copyright 2015 The Chromium Authors
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.
import unittest
import patch_orderfile
import symbol_extractor
class TestPatchOrderFile(unittest.TestCase):
def testRemoveSuffixes(self):
no_clone = 'this.does.not.contain.clone'
self.assertEquals(no_clone, patch_orderfile.RemoveSuffixes(no_clone))
with_clone = 'this.does.contain.clone.'
self.assertEquals(
'this.does.contain', patch_orderfile.RemoveSuffixes(with_clone))
with_part = 'this.is.a.part.42'
self.assertEquals(
'this.is.a', patch_orderfile.RemoveSuffixes(with_part))
def testUniqueGenerator(self):
@patch_orderfile._UniqueGenerator
def TestIterator():
yield 1
yield 2
yield 1
yield 3
self.assertEqual(list(TestIterator()), [1,2,3])
def testMaxOutlinedIndex(self):
self.assertEquals(7, patch_orderfile._GetMaxOutlinedIndex(
{'OUTLINED_FUNCTION_{}'.format(idx): None
for idx in [1, 2, 3, 7]}))
self.assertRaises(AssertionError, patch_orderfile._GetMaxOutlinedIndex,
{'OUTLINED_FUNCTION_{}'.format(idx): None
for idx in [1, 200, 3, 11]})
self.assertEquals(None, patch_orderfile._GetMaxOutlinedIndex(
{'a': None, 'b': None}))
def testPatchedSymbols(self):
# From input symbols a b c d, symbols a and d match themselves, symbol
# b matches b and x, and symbol c is missing.
self.assertEquals(list('abxd'),
list(patch_orderfile._PatchedSymbols(
{'a': 'a', 'b': 'bx', 'd': 'd'},
'abcd', None)))
def testPatchedSymbolsWithOutlining(self):
# As above, but add outlined functions at the end. The aliased outlined
# function should be ignored.
self.assertEquals(
list('abd') + ['OUTLINED_FUNCTION_{}'.format(i) for i in range(5)],
list(
patch_orderfile._PatchedSymbols(
{
'a': 'a',
'b': ['b', 'OUTLINED_FUNCTION_4'],
'd': 'd'
}, ['a', 'b', 'OUTLINED_FUNCTION_2', 'c', 'd'], 2)))
if __name__ == '__main__':
unittest.main()