Chromium Build Discussion #26

Open
RobRich999 opened this issue Oct 15, 2021 · 450 comments

@RobRich999
Owner

Discussion regarding Chromium builds and related topics.

@RobRich999 RobRich999 self-assigned this Oct 15, 2021
@RobRich999 RobRich999 pinned this issue Oct 15, 2021
@RobRich999 RobRich999 changed the title Chat Thread Chromium Build Discussion Oct 15, 2021
@RobRich999
Owner Author

RobRich999 commented Oct 15, 2021

@Alex313031 Setting the SIMD level to AVX for Windows is thankfully one of the easiest edits. In the Windows build config:

//chromium/src/build/config/win/BUILD.gn

Simply search for -msse3 and replace with -mavx. Clang will propagate the -mavx flag to lld for LTO, so do not worry about an ldflag for AVX. Default SIMD levels aside, the default x86-64 instruction model in LLVM is actually Intel Sandy Bridge, which already includes modelling for AVX instructions when using -mavx.

If you are cross building under Linux for Windows, you need to do the same in the compiler config as well, just like if you were natively doing a Linux build. The reason is libc++ pulls its build config from there.

//chromium/src/build/config/compiler/BUILD.gn
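
The search-and-replace described above can be scripted. This is only a minimal sketch, assuming the flag appears literally as -msse3 in each file; the helper name set_simd_flag is made up for illustration, and the paths are relative to //chromium/src.

```shell
# Hypothetical helper: replace one SIMD flag with another in a GN build file.
# A .bak copy of the original is kept next to the edited file.
set_simd_flag() {
  file=$1
  old=$2
  new=$3
  sed -i.bak "s/$old/$new/g" "$file"
}

# Intended use from //chromium/src (skipped when the files are absent).
for f in build/config/win/BUILD.gn build/config/compiler/BUILD.gn; do
  if [ -f "$f" ]; then
    set_simd_flag "$f" "-msse3" "-mavx"
  fi
done
```

Reviewing the result with git diff before building is a good idea, since the replacement touches every occurrence of the flag in the file.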

Make sure you disable AVX in the tflite config. It will not build successfully with AVX and above.

//chromium/src/third_party/tflite/BUILD.gn

Add the following cflags to the tflite config for an x86-64 build:

"-march=x86-64",
"-msse3",
"-mno-avx",
"-mno-avx2",
"-mno-fma",

I have updated the Widevine args. Windows builds just pushed. The Bitmovin demo works.

https://bitmovin.com/demos/drm

Awaiting a reroll of Linux builds after a Polly config update. Sadly nothing new there, which is one of the reasons I tend to tell people trying Polly to use a very basic Polly config to start.


I have a new Ryzen 8C/16T notebook on order. I would not be surprised if it turns out (much) faster for building Chromium than my two old 32-core Opteron systems.

@Rusenche

Hello

  1. Why don't you focus your efforts on building fully portable versions (with all files stored only in the current directory)?!

  2. Why don't you focus your attention and efforts on building a sufficiently hardened Chromium, like Ungoogled?!

I tried the Win-32 folder in the archive you uploaded to the repository here. It turned out to be x64. I started Chromium version 97 and the account folder was created in "\AppData\Local", which for me is totally unacceptable.

@RobRich999
Owner Author

RobRich999 commented Oct 15, 2021

My personal interest is optimizing Chromium performance. I do not care much about portable releases, and I have no personal interest in ungoogled patches. Long story short, y'all get the same builds I use. ;)


Straight copy-and-paste from chrome://version in the current Win32 zip archive:

https://github.com/RobRich999/Chromium_Clang/releases/download/v97.0.4670.0-r931446-win32-sse3/chrome.zip

Chromium 97.0.4670.0 (Official Build) (32-bit)
Revision b112428b14eb9a6a78dcf880b2c17236ecaad8a4-refs/heads/main@{#931446}

That said, you might have snagged the Win32 build that was up for awhile yesterday, and it might have been packaged or pushed incorrectly. Happens occasionally. Anyway, it was already well outdated when pushed, so I went ahead and simply removed it to be sure.


Jerry (woolyss) has a server-side script to generate portable releases (using crlauncher) of submitted builds. You might look into that if interested:

https://chromium.woolyss.com/

@Rusenche

Rusenche commented Oct 15, 2021

It was exactly from his site that I followed a link to your repository. ;)

In my opinion, a browser should be secure before it is fast, so I think the fight should be for browser privacy, not speed. Without a doubt, the Chrome developers already make it cumbersome enough, taking up a lot of memory and so on.

Privacy-respecting browsers can be counted on the fingers of one hand.

From this point of view, I have no interest in any speed optimizations unless the browser is also impervious to telemetry. In other words, if it is not a Marmaduke build, I end up with Ungoogled. As for how and why: not everyone has the knowledge to compile each new version themselves.

Lastly, I use Chrlauncher, but there is simply no real selection of privacy-focused Chromium builds. Only Marmaduke can be relied upon; he at least compiles and publishes them for download quickly, while Elostone has stopped publishing new versions for Windows.

I'm looking for a safe, protective browser, as Ungoogled is for now. If your builds of new versions were what I'm looking for ... but alas, they are not. ;)

It is a pity that the trend is a regression for consumers: taking away their privacy and their freedoms.

@Alex313031

Alex313031 commented Oct 15, 2021

Thanks as always!! Also, how does one contact Jerry? I made a comment, but I don't think he will see it or do anything. Some things are incorrect. Carl's (who makes Bromite) builds for Android need to show that they have all codecs, and Marmaduke's need to be reverted from all codecs+ to all codecs, because H.265 hasn't worked for a while now. He was only using build flags, which make the browser REPORT MIME handling of HEVC but won't actually play it, as he has not modified the proper header files like Nik used to do. However, doing this on my end failed on Linux; I'm not sure if it is Windows-specific. I'm also going to notify Marmaduke. Also, I'm starting to build often enough that I would like to see if he would consider posting my builds for Linux.

Also, I cannot for the life of me get cross building to work!! I would really like that so I don't have to boot into Windows.
I have compiled on Win 7 (yes, you can, despite them saying you can't, by changing some OS version checks and copying some files from a Win 10 VS install), 10, and 11, following the instructions at https://chromium.googlesource.com/chromium/src/+/refs/heads/main/docs/win_cross.md. However, after packaging the non-redistributable files using 'python package_from_installed.py 2019 -w 10.0.xxxx.xx', it always fails after copying the archive over to the Linux machine and running 'export DEPOT_TOOLS_WIN_TOOLCHAIN_BASE_URL=<path/to/sdk/zip/file>
export GYP_MSVS_HASH_='. This happens regardless of which of those Windows versions I use, with both the newest and older SDK versions, and on both Ubuntu Bionic and Debian Buster. So how are YOU cross building successfully?

How are you making your zips? I have been opting to manually copy everything to /home/alex/bin, running my own build artifact cleanup script, and making my own .desktop file in ~/.local/share/applications, but I would like the build to package itself into a zip automatically.

Lastly, where did you find info on downloading the PGO profile and on the setup and mini_installer targets? In other words, how does one list all the available targets for ninja/autoninja? These are notably absent from the Chromium documentation. Also, I attached my script and .desktop file (modeled after Debian's) for shits and giggles, in case you or anyone else finds them useful. I also included a .desktop file to run the content shell if you compile blink_tests, as well as a folder (for content shell) and a single .svg (for Chromium) to copy to /usr/share/icons/hicolor/scalable for both desktop files.
chromium-dev.zip

@RobRich999
Owner Author

RobRich999 commented Oct 16, 2021

Jerry's email address: https://info.woolyss.com/
Jerry's github: https://github.com/woolyss


On Windows, with VS2019, WinSDK 19041, and Python3 installed to their default directories, go to:

/depot_tools/win_toolchain

And run:

python3 package_from_installed.py 2019 --noarm -w 10.0.19041.0

Your choice on ARM support. I do not have it installed in my WinSDK config, as I do not target Windows builds for aarch64.

WinSDK 10.0.22000.0 might work as well. I know the script can package a toolchain from testing earlier this week, but I have not actually tried cross building with 10.0.22000.0, yet.

The script will assemble and spit out the needed SDK files in a zip archive. For example:

476813973f.zip

Copy that to somewhere on your Linux box. I have mine in /home/robrich/vs2019.

So the first part would be redirecting the Windows toolchain URL to a local directory:

export DEPOT_TOOLS_WIN_TOOLCHAIN_BASE_URL="/home/robrich/vs2019"

Next is overriding the default toolchain hash by setting the hash for the generated archive file.

export GYP_MSVS_HASH_3bda71a11e=476813973f

Having to find and override the hardcoded hash is kind of annoying, but it does not change often. It is currently 3bda71a11e and can be found in the //chromium/src/build/vs_toolchain.py script file.
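
Looking up the hardcoded hash can be scripted rather than done by eye. What follows is a rough sketch, assuming the script stores the hash in a line of the form TOOLCHAIN_HASH = '...' (an assumption about the file's layout, so check your own checkout). The helper name get_toolchain_hash is made up, and the directory and archive name are the ones from the example above.

```shell
# Hypothetical helper: print the hash from a TOOLCHAIN_HASH = '...' line.
get_toolchain_hash() {
  sed -n "s/^TOOLCHAIN_HASH = '\([0-9a-f]*\)'.*/\1/p" "$1" | head -n 1
}

# Intended use from //chromium/src (skipped when the script is absent).
if [ -f build/vs_toolchain.py ]; then
  default_hash=$(get_toolchain_hash build/vs_toolchain.py)
  export DEPOT_TOOLS_WIN_TOOLCHAIN_BASE_URL="/home/robrich/vs2019"
  # Map the default hash to the locally packaged archive (476813973f.zip).
  export "GYP_MSVS_HASH_${default_hash}=476813973f"
fi
```

If the helper prints nothing, the layout assumption does not hold for that revision and the hash has to be read from the file manually.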

Make sure you have target_os = ['android', 'win'] in your //chromium/.gclient config.

Now when you export those values, running gclient sync should pick up the toolchain change.

Alternatively, you can manually do the same from the //chromium/src directory if desired:

python build/vs_toolchain.py update --force

Add target_os = "win" to your build args (args.gn) and proceed as usual with your build process.


There are various ways of obtaining PGO data. I just manually run the PGO script to fetch it.

python tools/update_pgo_profiles.py --target=linux update --gs-url-base=chromium-optimization-profiles/pgo_profiles
python tools/update_pgo_profiles.py --target=win64 update --gs-url-base=chromium-optimization-profiles/pgo_profiles
python tools/update_pgo_profiles.py --target=win32 update --gs-url-base=chromium-optimization-profiles/pgo_profiles
python tools/update_pgo_profiles.py --target=mac update --gs-url-base=chromium-optimization-profiles/pgo_profiles
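
The four invocations differ only in the target name, so they can be generated in a loop. A small sketch follows: the function name print_pgo_fetch_commands is made up, and it only prints the commands so they can be inspected first.

```shell
# Print (not run) the PGO fetch command for each target listed above.
# Remove "echo" to execute them from //chromium/src.
print_pgo_fetch_commands() {
  gs_base="chromium-optimization-profiles/pgo_profiles"
  for target in linux win64 win32 mac; do
    echo python tools/update_pgo_profiles.py "--target=$target" update \
      "--gs-url-base=$gs_base"
  done
}

print_pgo_fetch_commands
```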

There is no 32-bit PGO data for Linux builds. There might be a 32-bit x86 Linux AFDO sample profile generated by the ChromiumOS project, but I have not looked lately. AFAIK, I have never even done a Chromium 32-bit Linux build.


There are various installer and packaging scripts under the //chromium/src/chrome/installer/ directory.

You might have to skim the scripts to get ideas. For example, here is how to do a deb package when building:

ninja -C out/release "chrome/installer/linux:unstable_deb"

You probably would want to clean up the Google Chrome stuff if doing a deb package. Remove the lines below remove_udev_symlinks in the following files:

//chromium/src/chrome/installer/linux/debian/postinst
//chromium/src/chrome/installer/linux/debian/postrm


You can extract Chromium from the deb if desired. For example, something like this:

ar x chromium-browser-unstable_97.0.4670.0-1_amd64.deb

The data.tar.xz archive has the files you want; already in their recommended directory layout.
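
For completeness, here is a sketch of unpacking that payload into a directory of your choice. The helper name extract_deb is hypothetical, and it assumes the payload member is named data.tar.xz as described above.

```shell
# Hypothetical helper: unpack the payload of a .deb into a target directory.
# $1 = path to the .deb, $2 = destination directory.
extract_deb() {
  # Resolve to an absolute path, since we change directory below.
  deb="$(cd "$(dirname "$1")" && pwd)/$(basename "$1")"
  dest=$2
  mkdir -p "$dest"
  (
    cd "$dest" || exit 1
    ar x "$deb" data.tar.xz   # pull only the payload member out of the archive
    tar -xf data.tar.xz       # tar detects the xz compression on its own
    rm -f data.tar.xz
  )
}

# Example (file name from the comment above):
#   extract_deb chromium-browser-unstable_97.0.4670.0-1_amd64.deb ./chromium-files
```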

BTW, in theory, you might be able to even use alien to convert the deb package to various other package formats. Never tried it here, though.


For a list of ninja targets, try:

gn args out/release --list > targets.txt

The large list of targets is piped to the text file, though I am not sure installer targets are listed. Been awhile.

@Alex313031

Thank you. I know about the different PGO datasets, but where did you find your information? Where is it in the documentation?
I already use gn args out/release --list, but that only lists the available GN arguments, not all the possible targets.
And I was asking how you make your .zips for Linux and Windows: how do you get rid of all the other build artifacts so the zip only has what it needs to run and isn't 4+ GB?
I have tried alien on an Ubuntu Chromium package and it works well.

On cross building, that is EXACTLY what I've been doing and it still fails, but I haven't tried 'python build/vs_toolchain.py update --force', so I'll try that. If it still fails, I'll post the error output here.
Also thanks for making a discussion, tis noice broski.
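
On the target-listing question, GN and ninja each have a listing command that might help. The commands in the comments below need a configured build directory, so they are shown but not run. The helper find_installer_targets is a made-up name for filtering a saved list.

```shell
# List every GN target label (needs a configured out/release):
#   gn ls out/release > targets.txt
# List every ninja target name:
#   ninja -C out/release -t targets all > ninja_targets.txt

# Hypothetical helper: narrow a saved list down to installer-related entries.
find_installer_targets() {
  grep -i 'installer' "$1"
}

# Example:
#   find_installer_targets targets.txt
```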

@Paukan777

Paukan777 commented Oct 16, 2021

Answering to #23

You said -avx encodes both AVX and SSE3 in VEX, but I found that -avx2 does the same thing. A disassembly screenshot is attached (built with -avx2 -march=skylake). As you can see, vmovups is used instead of movups where 128- and 256-bit registers are used close to each other, and so on. I guess clang is smart enough to encode instructions so as to minimize penalties.

image

Here is another disassembly snippet from the same binary, where the instructions are not VEX encoded. But there is no mixing with ymm registers, so there will be no penalty. The compiler decided to use "old-style" encoding.

image

So, don't underestimate the compiler. I think there is reason to use benchmarks to decide whether AVX2 builds are better.

@RobRich999
Owner Author

RobRich999 commented Oct 17, 2021

You are overthinking it. Naturally -mavx2 would do the same thing. Why wouldn't it? It includes AVX. Just as if you set -mavx512, and you get AVX and AVX2 as well. All of them enable VEX encoding.

The second screenshot does not include mixing of XMM and YMM instructions, which is what we are discussing here. Throw some AVX code in there, then see what potentially happens. Now think about Chromium. Several components have CPU dispatch, which can cause mixed encodings, as LLVM is building part of the code at -msse3 and the dispatched code at -mavx, -mavx2, etc. LLVM is not going to optimize the -msse3 code for VEX encoding using the default x86-64 processor target.

If you want an AVX2 build of Chromium, do it. ;) Going a little further back, I used to target Haswell and tune for Skylake. I did not want to pull in any instructions from Skylake or later, as I was pushing public builds for larger consumption, but I did want to target my primary Kaby Lake system with instruction tuning.

@RobRich999
Owner Author

RobRich999 commented Oct 17, 2021

@Alex313031 The Windows zip archives are just reformats and recompresses of the chrome.7z archive generated when building the mini_installer. ;) I started doing the zip archives to appease a few people that complained about installing 7z or similar to use the files.

The only thing I do on Linux is generate debs, which can be unarchived if desired. The ungoogled project apparently generates a portable Linux config, but I have not bothered really looking into their process.


BTW, make sure you are building without any debug symbols. I do not think symbols are pulled into chrome.7z anyway, but it has been a long time since bothering with them, so just covering the bases here.

@RobRich999
Owner Author

Before I forget, about documentation, AFAIK it simply does not exist for some features. I have picked up lots of build processes over the years through crbug.com reports, git changes, looking at project buildbots, skimming the source, etc.

@Alex313031

Alex313031 commented Oct 17, 2021

@RobRich999 Oh lawd, this is half of what I've had to do to learn about building Chromium. You'd think PGO and build targets would be documented. Only chrome, chromedriver, blink_tests, and content_shell are well documented. About the Windows archives: ah, that makes sense. Does the installer or build process use its own p7zip or lzma/lib7zip? And yeah, duh, you don't make Linux archives. If you ever do make Linux zips, check out the cleanup script I mentioned; it's only meant for Linux anyway. For making a portable install, I have a bash script called cr and a .desktop file, and all it really does is pass these flags: '--user-data-dir=../.profile --no-default-browser-check --allow-outdated-plugins --disable-logging --disable-breakpad'. I use this because I also run Debian's Chromium, which I think is sadly going to remain stuck on M90 for Buster, though I mostly use my own builds. I want them to run independently of each other, with my own builds not touching the system outside of their directory. This is all that is needed on Linux to make a portable install.
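
A launcher along those lines might look roughly like the sketch below. It is not the actual cr script: the helper name portable_flags and the binary name chrome are assumptions, and the profile path is anchored to the binary's directory rather than to the current working directory.

```shell
# Hypothetical helper: print the portable-mode flags, one per line.
# $1 = directory containing the browser binary.
portable_flags() {
  printf '%s\n' \
    "--user-data-dir=$1/../.profile" \
    "--no-default-browser-check" \
    "--allow-outdated-plugins" \
    "--disable-logging" \
    "--disable-breakpad"
}

# Example launch from a wrapper script (breaks on paths containing spaces):
#   dir=$(cd "$(dirname "$0")" && pwd)
#   "$dir/chrome" $(portable_flags "$dir") "$@"
```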

So is there anything else on build targets? Because idk how you're getting a list of those from gn args out/Default --list.
Also, here are another three flags I would recommend. pffft was introduced in M93 and would fail with AVX for a long time, but as of M97 it works, and it is better and faster than ffmpeg for WebAudio, which is increasingly being used for WebRTC and video chatting.
To learn about what pffft is > https://github.com/marton78/pffft#:~:text=PFFFT%20is%20a%20fork%20of,few%20other%20signal%20processing%20functions
rtc_use_h264 = true
use_webaudio_ffmpeg = false
use_webaudio_pffft = true
The first keeps H.264 WebRTC from bugging out with pffft if the content is H.264_main3.0 or higher, i.e. 1080p or higher. The second is needed because, unlike most flags, which are smart about disabling incompatible ones, the ffmpeg one is not disabled automatically when pffft is set, and the build errors out unless it is explicitly disabled.
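
To try the three args in isolation, they can be written to a scratch build directory. This is only a sketch: out/pffft_test is a made-up directory name, and the heredoc overwrites any args.gn already there, so merge by hand for an existing build directory.

```shell
# Write the three WebAudio/WebRTC args to a scratch args.gn.
out_dir="out/pffft_test"
mkdir -p "$out_dir"
cat > "$out_dir/args.gn" <<'EOF'
rtc_use_h264 = true
use_webaudio_ffmpeg = false
use_webaudio_pffft = true
EOF

# Next step, from //chromium/src:
#   gn gen out/pffft_test
```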

Also, are you using is_official_build? If not (and even if you are, because there is no good documentation on what is_official_build actually changes, so I'm not sure whether it sets these), you can set the following to further remove debugging stuff (a reduction of about 40 MB when I ran du -h in my out dir after running the cleanup script):
enable_debugallocation = false
enable_iterator_debugging = false
enable_resource_allowlist_generation = false

A last flags thing: WebUI is increasingly being used in Chromium and ChromiumOS, so this is worthwhile to set, but in my testing it increases build time by 10 minutes (but again, I'm on an FX-8370), so probably not an issue for you. If you wanna know more > https://chromium.googlesource.com/chromium/src/+/HEAD/docs/webui_explainer.md
optimize_webui = true

Did you try out my chromium-dev-editor? It doesn't matter if you didn't, but if you did, how did it run on Win 10 AVX? One known bug that I can't seem to figure out is that the menu items won't respond to clicks and you gotta use the arrow keys. The majority of it is programmed in Dart, and idk much about it. Almost everything besides updating the icon, about section, and .crx manifest just took a bajillion Google searches and Stack Overflow lol.

Lastly, since you shared your args.gn, I'll share mine, which attempts to make a build that is as much like Google Chrome in features and performance as possible. Some of the things in there wouldn't be there without your help, so kudos.

google_api_key = ""
google_default_client_id = ""
google_default_client_secret = ""
is_official_build = true
is_debug = false
enable_debugallocation = false
enable_iterator_debugging = false
enable_resource_allowlist_generation = false
is_component_build = false
symbol_level = 0
enable_nacl = true
optimize_webui = true
use_lld = true
blink_symbol_level=0
enable_precompiled_headers = false
media_use_ffmpeg = true
media_use_libvpx = true
proprietary_codecs = true
ffmpeg_branding = "Chrome"
enable_ffmpeg_video_decoders = true
is_component_ffmpeg = true
use_webaudio_ffmpeg = false
use_webaudio_pffft = true
use_vaapi = true
use_vr_assets_component = true
enable_widevine = true
bundle_widevine_cdm = false
enable_media_drm_storage = true
enable_media_overlay = true
enable_hangout_services_extension = true
rtc_use_h264 = true
enable_vr = true
enable_platform_hevc = true
enable_platform_ac3_eac3_audio = true
enable_platform_dolby_vision = true
enable_platform_mpeg_h_audio = true
enable_mse_mpeg2ts_stream_parser = true
enable_platform_encrypted_hevc = true
use_thin_lto = true
thin_lto_enable_optimizations = true
chrome_pgo_phase = 2
pgo_data_path = "/home/alex/chromium/src/chrome/build/pgo_profiles/chrome-linux-main-1634428778-a43eab9dd5ad60bddb027ec878634f6b1e31bee3.profdata"

@Alex313031

Oh shit I put my api keys!! how do I delete a comment???

@Alex313031

Nvm figured it out, and I just edited it.

@RobRich999
Owner Author

A step further, outright disable PDB creation by searching //chromium/src/build/config/compiler/BUILD.gn for config("no_symbols") and, below it, setting /DEBUG:NONE for Windows builds.

Just doing a very quick skim right now. ;) I will try to comment on the other inquiries later today or maybe tomorrow.

@Paukan777

Paukan777 commented Oct 17, 2021

Several components have CPU dispatch, which can cause mixed encodings, as LLVM is building part of the code at -msse3 and the dispatched code at -mavx, -mavx2, etc. LLVM is not going to optimize the -msse3 code for VEX encoding using the default x86-64 processor target.

Then why not build ALL the code with -mavx2, not only the dispatched parts, to globally avoid SSE3 code?

@RobRich999
Owner Author

RobRich999 commented Oct 17, 2021

Because building globally with -mavx accomplishes the same in regards to SSE3 code. Plus code dispatch for AVX2, AVX512, etc. still is honored where present in whatever components.

The reality is we are talking a project where the bulk of the core code is actually scalar anyway. ;)

Also, as I have noted elsewhere, AVX2 was my most annoying build to maintain due to breakages. I was not building it natively, which made it all the more "fun"* when something inevitably broke. I might eventually bring back AVX2 builds. In the meantime, if someone else wants to maintain an AVX2 or whatever build, then I will try to answer build questions as best I can.

*Hobbies are supposed to be interesting and fun IMO. When a hobby is no longer either, then it becomes time to reevaluate one's efforts. Admittedly, I have done Chromium builds for so many years now that I am pretty much burnt out on the project; and have been for quite a long time now. It is a big part of why my build intervals have declined in recent times.

@Paukan777

Woow, just tried to build current version, gclient runhooks now requires python 3.10, and depot_tools still contains 3.8 ))

@Paukan777

*Hobbies are supposed to be interesting and fun IMO. When a hobby is no longer either, then it becomes time to reevaluate one's efforts. Admittedly, I have done Chromium builds for so many years now that I am pretty much burnt out on the project; and have been for quite a long time now. It is a big part of why my build intervals have declined in recent times.

Maybe it's time to publish a complete howto with as many explanations as possible, so people like me and others will be able to make their own "it works for me" builds. Maybe someone will then maintain them or create their own repo.

@Alex313031

Alex313031 commented Oct 18, 2021

@RobRich999 I'm sorry to hear that, mate. Also, Jerry fixed Carl's entry by adding all codecs and changed Marmaduke's from allcodecs+ to allcodecs at my request, and he has accepted my Thorium builds for the Linux subsection!! He also accepted my ChromiumOS builds, but I need to find a place to put them, as I don't think GitHub will allow the 6.8 GB apiece disk image files.
Did you decide to use any of the flags I mentioned? And thanks for the PDB thing. I was already using the flag to disable it, but changing that removes it completely.
@Paukan777 I saw that. However, I fixed it by completely clobbering chromium/src, depot_tools AND (this is important) .vpython-root and .vpython_cipd_cache, which can be found in your home folder. After starting completely fresh, Chromium builds like normal and runhooks is fine. .vpython-root stores multiple Python binaries that run independently of the system, and sometimes depot_tools doesn't like to runhooks if a newer Python is needed, because runhooks is what installs the newer version. Stupid, I know.
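
The cache-clearing part of that fix can be sketched as a small helper. The name clear_vpython_caches is made up. It only removes the two vpython directories named above and leaves the checkout and depot_tools alone.

```shell
# Hypothetical helper: remove the vpython caches under a home directory.
# $1 defaults to $HOME. chromium/src and depot_tools are NOT touched here.
clear_vpython_caches() {
  home_dir=${1:-$HOME}
  rm -rf "$home_dir/.vpython-root" "$home_dir/.vpython_cipd_cache"
}

# Example:
#   clear_vpython_caches
```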

@Alex313031

Also @RobRich999 What's your build system, i.e. CPU, RAM, GPU, OS, OS version?

@Alex313031

Alex313031 commented Oct 18, 2021

@Paukan777 I would be willing to publish an in depth how-to to my github if you want.

@Alex313031

Alex313031 commented Oct 18, 2021

I am proud to announce the first actual release of Thorium, which will be put on chromium.woolyss.com right underneath yours @RobRich999 as soon as Jerry reads his email today.
https://github.com/Alex313031/Thorium
Release is in .deb format and is at 97.0.4674.0

Also uploaded is linux-chromeos, where it runs the whole system UI in an X11 window. Jerry said he might add that to the chromiumos subsection.
https://github.com/Alex313031/ChromeOS-Linux

Also uploaded are full ChromiumOS x86_64 builds with x264, module, and full Linux firmware support. If someone here wants a tutorial on API keys to allow sign-in on ChromiumOS OR to enable sync on Chromium (or any other Chromium-based browser for that matter), I can provide them sparingly upon request.
https://github.com/Alex313031/ChromiumOS

@RobRich999
Owner Author

@Alex313031 Congrats! :)

Windows 10 build box
2x Opteron 6378 (32 cores)
64GB ECC DDR3
Crucial 256GB SSD (os drive)
2x Micron 1TB SSDs (enterprise class)
Radeon RX460 graphics

Kubuntu build box
2x Opteron 6380 (32 cores)
64GB ECC DDR3
2TB Seagate SSHD hybrid drive
32GB Intel Optane SSD drive
Geforce 730 GT graphics

I just obtained a new Ryzen 5700u 8c/16t notebook. I might get around to setting up a dev environment on it.

@RobRich999
Owner Author

@Alex313031 It does not hurt to set them, but several of your build args already match their default settings, at least with is_official_build = true enabled. ;) A few examples:

rtc_use_h264 = true
use_webaudio_ffmpeg = false
use_webaudio_pffft = true
optimize_webui = true

I do use is_official_build = true for building. Primarily it turns off component and debug building. It also turns off ThinLTO optimizations (for now) due to binary bloat, but we turn the optimizations back on with the other ThinLTO build args anyway, so no concern there.

@RobRich999
Owner Author

@boyedarat Builds updated, including Win64 AVX512.

@lilacluster611

My dream is you make AVX512 win64 build with patches from ungoogled chromium.
I'm using generic ungoogled chromium build, I wish I could use one with your optimizations.

@Andarwinux

AVX512 ungoogled chromium would be cool, but that means no more live at HEAD.

As an alternative, you might consider the ungoogled chromium built with PGO+LTO by macchrome.

@RobRich999
Owner Author

Apologies on the extended build cycle delay here. Tropical weather systems among other things, plus I had to sort out a network issue with my build server.

I am pushing Linux build updates right now. I will try to get Windows builds done later today.

Note H265 software decoding is not enabled due to various changes in ffmpeg. I poked at it for awhile, but I ended up moving on to getting builds done. H265 hardware decoding is working okay in my quick testing of the feature.

Also YMMV on VAAPI hardware encoding assuming that is even a concern for anyone. It is not being detected on my primary Linux notebook at the moment, regardless of the usual source edit and even the CLI options tried (so far).

@RobRich999
Owner Author

Various Windows builds updated. Same issue regarding H265 software support. H265 hardware should be okay, but it is not testable for Windows under VM on my Linux notebook, so YMMV.

@RobRich999
Owner Author

Linux builds updated.

Windows builds probably tomorrow.

Software H265 is not enabled for now. I could likely step through the various needed source edits and make changes, but I just have not really felt much like bothering with it.

@RobRich999
Owner Author

RobRich999 commented Oct 28, 2024

WinAVX and WinAVX2 updated. WinAVX512 will be delayed probably until Tuesday.

@RobRich999
Owner Author

WinAVX512 build updated:

https://github.com/RobRich999/Chromium_Clang/releases/tag/v132.0.6806.0-r1375213-win64-avx512

@RobRich999
Owner Author

WinAVX and WinAVX2 builds updated. WinAVX512 and Linux builds in a day or few.

@RobRich999
Owner Author

WinAVX512 build updated:

https://github.com/RobRich999/Chromium_Clang/releases/tag/v133.0.6836.0-r1382329-win64-avx512

@RobRich999
Owner Author

Updated Linux and Windows builds. Linux AMD Zen2 build here:

https://github.com/RobRich999/Chromium_Clang/releases/tag/v133.0.6836.0-r1382377-linux64-deb-avx2-znver2

@RobRich999
Owner Author

RobRich999 commented Nov 14, 2024

I have verified hardware video decoding is working (for now :p ) with Wayland on AMDGPU using these CLI options:

--ozone-platform-hint=auto --enable-features=AcceleratedVideoDecodeLinuxZeroCopyGL,AcceleratedVideoDecodeLinuxGL,VaapiIgnoreDriverChecks

Those with Nvidia GPUs can try adding VaapiOnNvidiaGPU to the enable-features list as well.
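
Assembling the feature list by hand is error-prone, so here is a small sketch that builds the flag string and optionally appends the Nvidia feature. The helper name vaapi_launch_flags and its "nvidia" argument are made up for illustration.

```shell
# Hypothetical helper: print the launch flags; pass "nvidia" to append
# the VaapiOnNvidiaGPU feature to the list.
vaapi_launch_flags() {
  features="AcceleratedVideoDecodeLinuxZeroCopyGL,AcceleratedVideoDecodeLinuxGL,VaapiIgnoreDriverChecks"
  if [ "$1" = "nvidia" ]; then
    features="$features,VaapiOnNvidiaGPU"
  fi
  printf '%s\n' "--ozone-platform-hint=auto --enable-features=$features"
}

vaapi_launch_flags
vaapi_launch_flags nvidia
```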

Tested using Chromium 133.0.6836.0 on CachyOS. AMD 5700u with Vega graphics.

Decode h264 baseline          : 64x64 to 4096x4096 pixels
Decode h264 main              : 64x64 to 4096x4096 pixels
Decode h264 high              : 64x64 to 4096x4096 pixels
Decode vp9 profile0           : 64x64 to 8192x4352 pixels
Decode vp9 profile2           : 64x64 to 8192x4352 pixels
Decode hevc main              : 64x64 to 8192x4352 pixels
Decode hevc main 10           : 64x64 to 8192x4352 pixels
Decode hevc main still-picture: 64x64 to 8192x4352 pixels

@LSX774166810

I have a problem building an old version of Chromium.
How do I build an old version, a very old version... 90+.0.xxxx.0 kind of old?

@RobRich999
Owner Author

RobRich999 commented Dec 3, 2024

@LSX774166810 Been too long ago for me. In theory checking out the git tag should work.

https://stackoverflow.com/questions/47087970/how-to-checkout-and-build-specific-chromium-tag-branch-without-download-the-full

When doing the gclient runhooks command, that should pull the tools for building that version.
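
As a rough outline of that sequence, the sketch below prints one plausible set of commands for a given tag without running anything. The function name is made up and the exact flags are an assumption that may need adjusting for very old revisions. gclient sync normally runs hooks itself, so the last step may be redundant.

```shell
# Print (not run) a plausible command sequence for building from a tag.
# $1 = the release tag to check out; run the commands from //chromium/src.
print_tag_checkout_steps() {
  tag=$1
  printf '%s\n' \
    "git fetch --tags" \
    "git checkout tags/$tag" \
    "gclient sync -D --with_branch_heads --with_tags" \
    "gclient runhooks"
}

# Example, with the tag left as a placeholder:
print_tag_checkout_steps "<tag>"
```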

You can try posting an inquiry on the chromium-dev group if further help is needed as well.

https://groups.google.com/a/chromium.org/g/chromium-dev

The above said, I would seriously suggest not using such an old build actively on the net due to potential security concerns.

@RobRich999
Owner Author

RobRich999 commented Dec 3, 2024

Windows builds updated. Software H265 restored as well; thanks StaZhu!

@ferrreo

ferrreo commented Dec 5, 2024

Is it possible the video stuff could be split into a separate patch from the AVX2 changes? I use a different patchset for video (StaZhu's), so I have to manually edit this one to avoid duplication. Additionally, any chance BOLT support could get integrated into this (basically adding "-Wl,--emit-relocs" to the lld flags)?

One other thing I have noticed is the latest version of the avx2 linux patch seems to be changing some things inside of isWin checks which does not seem correct for a linux only patch?

Otherwise awesome work on this patchset.

@RobRich999
Owner Author

RobRich999 commented Dec 6, 2024

@ferrreo If I felt better these days, yeah, I could maintain multiple patches. For now you will have to manually remove the video edits. Thankfully it is just a matter of removing certain file sections from the patches.

The Linux AVX2 patches have been updated. I had FMA enabled on the wrong OS conditional. I was probably "half out of it" when I tossed those patches together the other day. Thanks for the FYI! :)

About BOLT, last time I tried it, CFI had to be disabled for it to work. I would not mind having the possible performance benefits of BOLT, but not really at the expense of CFI-related security mitigations. Also I seem to have a nagging feeling that I ended up having BOLT ignore certain other functions due to other issues, though that was quite awhile ago, so YMMV.

Propeller would be an alternative option, and I suspect it would work okay with CFI. The "problem" is dealing with more rounds of profiling and building. The same issue I have with doing CSPGO as well.

I did manage to convince a project dev to start work on enabling Temporal PGO, which would work for Linux, MacOS, and similar. I know he was interested mostly to get it working for Chromium on Android to possibly reduce app startup times. I am not really sure as to his planned roadmap.


Ultimately, I used to maintain much finer-grained optimizations with various LLVM passes, tweaks to LLVM limits, Polly, etc. I even used to internally modify LLVM, including pass features and ordering.

I just do not feel like dealing with the effort anymore, especially tracking down annoying regressions. The current SIMD instruction set modifications are low-hanging fruit, and they rarely regress performance, which is why I keep them going here.

@Andarwinux

Maybe consider aligning Chromium to 2 MB. On a new enough kernel with CONFIG_READ_ONLY_THP_FOR_FS=y, it will automatically use THP and thus reduce page faults, which gets most of the performance boost without BOLT.
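For anyone wanting to check their own system before testing this, here is a minimal sketch, assuming a Linux kernel that exposes the usual transparent hugepage sysfs file. The function names are hypothetical helpers, not part of any Chromium or kernel tooling. It only reports the global THP mode and tests 2 MiB alignment of an address; it does not confirm CONFIG_READ_ONLY_THP_FOR_FS or that the browser's text segment is actually backed by huge pages.

```python
from pathlib import Path

# Standard sysfs location for the global THP mode on Linux.
THP_ENABLED_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")
HUGE_PAGE_SIZE = 2 * 1024 * 1024  # 2 MiB, the x86-64 huge page size


def parse_thp_mode(text):
    """Return the active mode from text such as 'always [madvise] never'.

    The kernel marks the active choice with square brackets.
    Returns None if no bracketed entry is found.
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def current_thp_mode(path=THP_ENABLED_PATH):
    """Read the sysfs file; return None if it is missing (e.g. non-Linux)."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None


def is_hugepage_aligned(address, alignment=HUGE_PAGE_SIZE):
    """True if a load address sits on a 2 MiB boundary."""
    return address % alignment == 0
```

Calling `current_thp_mode()` on a typical distro kernel would be expected to return `always`, `madvise`, or `never`; segment load addresses for the alignment check could be taken from `readelf -lW` output on the binary.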

@ferrreo

ferrreo commented Dec 6, 2024

Having to disable CFI does suck, but I found out a lot of distros don't enable CFI in the Chromium they ship in the first place, so in most cases you are not gaining security but not losing any either.

@RobRich999
Owner Author

@Andarwinux Chromium runs okay with the iodlr shared library for THP remapping if you want to test further.

https://github.com/intel/iodlr/tree/master/large_page-c

I do not think I ever bothered collecting much benchmark data using Chromium with iodlr, since it does not appear I logged any performance returns in my notes. Otherwise, it could be that any returns were too insignificant to bother noting at the time. YMMV.

BOLT does more than "hugify" and align, though yeah, those are likely two of the lower-hanging fruits on the optimization tree.

@Alex313031

@RobRich999 Why did you remove the no-emulated-tls thing? Was it because of compiler crashes?

c072733#diff-1928a7effd7a863bc9ea99544e60ae5b610afbaafc56c222c969628f757e12b3L19

Also, what did it do anyway?

@Sunspark-007

Hi, does the possibility exist that an appimage test build of this could be made to see how it does?

(Preferably one that uses the system's .fonts.conf instead of defining its own.)

Immutable systems like Steam Deck only use appimage or flatpak.

@LordSlon

Can you build with google sync support?

@necros2k7

@RobRich999 Hello, can you supply benchmark results / metrics from before and after the AVX* build optimizations, to see if the optimizations are actually working? Or what's the fastest way for me to measure it myself?

@enginelesscc

Would you mind having the gpu-blocklist disabled by default? (ignore-gpu-blocklist)

Windows VMs especially will benefit greatly from this.
Microsoft's D3D WARP software renderer is a lot (!) faster than SwiftShader.

Ideally, forcefully returning false here makes more sense instead:
https://source.chromium.org/chromium/chromium/src/+/main:gpu/config/gpu_info.cc;l=201

@Alex313031 that's also something you may wanna do in your builds^

@enginelesscc

Oh also an update on my scores with the avx512 build:

Incredible, massive boost and difference. Same hardware as before (7950X + 4090, unclean Windows install).

2.1:

image

3.0:

image

@necros2k7

@enginelesscc where are "before" screens to compare?

@enginelesscc

> @enginelesscc where are "before" screens to compare?

#26 (comment)

@RobRich999
Owner Author

RobRich999 commented Dec 25, 2024

@Alex313031 The values present are compiler defaults anyway. Sometimes I experiment with things and end up leaving cruft in the diffs. ;) The flag controls whether thread-local variables are accessed through emulated TLS (runtime library calls) or native thread-local storage. Most platforms support native TLS, so there is no need for emulated TLS.


@Sunspark-007 It is all I can do to keep these builds occasionally updated these days, so apologies, but I am not targeting additional build formats at the moment. You might ask Alex313031 to see if he is interested.

Otherwise, if you feel really adventurous and want to sort through any needed dependencies, there is a way to generate an AppImage from a deb package.

https://docs.appimage.org/packaging-guide/converting-binary-packages/pkg2appimage.html


@LordSlon Sync is actually present; I just do not supply the Google API keys needed for it to work. You can generate your own Google API keys if desired, then set them as environment variables. Woolyss has more info:

https://chromium.woolyss.com/#google-api-keys
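As a rough illustration of the environment variable approach, here is a minimal sketch. It assumes the variable names GOOGLE_API_KEY, GOOGLE_DEFAULT_CLIENT_ID, and GOOGLE_DEFAULT_CLIENT_SECRET, which is how I recall them from the Chromium API keys documentation, so verify them against the link above. The helper name and the placeholder values are hypothetical.

```python
import os


def build_chromium_env(api_key, client_id, client_secret, base_env=None):
    """Return a copy of the environment with the Google API key variables set.

    base_env defaults to the current process environment; the original
    mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Variable names are an assumption based on the Chromium API keys docs.
    env["GOOGLE_API_KEY"] = api_key
    env["GOOGLE_DEFAULT_CLIENT_ID"] = client_id
    env["GOOGLE_DEFAULT_CLIENT_SECRET"] = client_secret
    return env


# Example use (the binary path is a placeholder for your own install):
#   import subprocess
#   env = build_chromium_env("your-key", "your-client-id", "your-secret")
#   subprocess.Popen(["/path/to/chrome"], env=env)
```

Setting the variables per launch like this keeps the keys out of the system-wide environment; exporting them in a shell profile works just as well.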


@necros2k7 https://www.browserbench.org/ is a good starting point. I used to do routine testing, though it largely became redundant, as raising the SIMD support levels and processor scheduling models tended to bring at least modest uplifts in previous years of testing. Note that a YMMV case, depending upon workload, would be something like AMD Jaguar (if not explicitly targeted by -march or -mtune), which supports up to 256-bit AVX but only has 128-bit wide FP/SIMD units, so 256-bit ops execute as two 128-bit ops.
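Before comparing builds, it can help to confirm which SIMD levels the CPU actually reports. The following is a minimal sketch, assuming an x86 Linux system where /proc/cpuinfo has a `flags` line; the function name is a hypothetical helper. It only reports advertised instruction set support and says nothing about execution unit width, so a case like Jaguar would still show `avx` as present.

```python
# Flag names as they appear in the Linux /proc/cpuinfo "flags" line.
SIMD_FLAGS = ("avx", "fma", "avx2", "avx512f")


def simd_support(cpuinfo_text):
    """Map each flag in SIMD_FLAGS to True/False from /proc/cpuinfo text.

    Only the first "flags" line is read, since all cores normally
    report the same set. Returns all False if no such line exists.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            return {flag: flag in present for flag in SIMD_FLAGS}
    return {flag: False for flag in SIMD_FLAGS}


# Example use on Linux:
#   with open("/proc/cpuinfo") as f:
#       print(simd_support(f.read()))
```

If `avx2` or `avx512f` comes back False, the matching build will not run on that machine, so there is nothing to benchmark.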


@enginelesscc Thanks for the performance testing. :) I might consider the software renderer after some further testing. It might take a while, as I am not much of a Windows user. Either way, passing the CLI option (--ignore-gpu-blocklist) is easy enough for most to configure for now.


I will be pushing updated Linux builds in a few minutes. Windows builds might be later today or tomorrow.
