粗略判断Shader每条代码的成本

GPU IS a processor (graphics proccessing unit). Anywho, i remember seeing somewhere that in geforce 6 series cards its a signle cycle (maybe i was just dreaming :-p) but i have that memory

radeon x800 has it anyways
EDIT:

Quote:

ORIGINALLY AT: http://gear.ibuypower.com/GVE/Store/ProductDetails.aspx?sku=VC-POWERC-147
Smartshader HD•Support for Microsoft® DirectX® 9.0 programmable vertex and pixel shaders in hardware
• DirectX 9.0 Vertex Shaders
- Vertex programs up to 65,280 instructions with flow control
- Single cycle trigonometric operations (SIN & COS)
• Direct X 9.0 Extended Pixel Shaders
- Up to 1,536 instructions and 16 textures per rendering pass
- 32 temporary and constant registers
- Facing register for two-sided lighting
- 128-bit, 64-bit & 32-bit per pixel floating point color formats
- Multiple Render Target (MRT) support
• Complete feature set also supported in OpenGL® via extensions

继续阅读粗略判断Shader每条代码的成本

Android Gradle Plugin源码解析之externalNativeBuild

在Android Studio 2.2开始的Android Gradle Plugin版本中,Google集成了对cmake的完美支持,而原先的ndkBuild的方式支持也变得更加良好。这篇文章就来说说Android Gradle Plugin与交叉编译之间的一些事,即externalNativeBuild相关的task,主要是解读一下gradle构建系统相关的源码。

继续阅读Android Gradle Plugin源码解析之externalNativeBuild

Overriding a default option(…) value in CMake from a parent CMakeLists.txt

CMakeLists.txt

CMakeLists.txt

执行如下命令的时候:

会观察到生成的配置文件中 BUILD_FOR_ANDROID 不一定能生效。

需要如下配置才行:
CMakeLists.txt

参考链接


Use ccache with CMake for faster compilation

C and C++ compilers aren’t the fastest pieces of software out there and there’s no lack of programmer jokes based on tedium of waiting for their work to complete.

There are ways to fix the pain though - one of them is ccache. CCache improves compilation times by caching previously built object files in private cache and reusing them when you’re recompiling same objects with same parameters. Obviously it will not help if you’re compiling the code for the first time and it also won’t help if you often change compilation flags. Most C/C++ development however involves recompiling same object files with the same parameters and ccache helps alot.

For illustration, here’s the comparison of first and subsequent compilation times of a largish C++ project:

Original run with empty cache:

Recompilation with warm cache:

Installation

CCache is available in repositories on pretty much all distributions. On OS X use homebrew:

and on Debian-based distros use apt:

CMake configuration

After ccache is installed, you need to tell CMake to use it as a wrapper for the compiler. Add these lines to your CMakeLists.txt:

Rerun cmake and next make should use ccache for wrapper.

Usage with Android NDK

CCache can even be used on Android NDK - you just need to export NDK_CCACHE environment variable with path to ccache binary. ndk-build script will automatically use it. E.g.

(Note that on Debian/Ubuntu the path will probably be /usr/bin/ccache)

CCache statistics

To see if ccache is really working, you can use ccache -s command, which will display ccache statistics:

On second and all subsequent compilations the “cache hit” values should increase and thus show that ccache is working.

参考链接


Use ccache with CMake for faster compilation

macOS Mojave(10.14.3)系统QEMU虚拟机运行Clockwork OS


编译 u-boot

编译 Linux 内核


手工编译 qemu

可惜到这一步了,还是没办法成功运行系统。

参考链接


Using QEMU to emulate a Raspberry Pi

If you're building software for the Raspberry Pi (like I sometimes do), it can be a pain to have to constantly keep Pi hardware around and spotting Pi-specific problems can be difficult until too late.

One option (and the one I most like) is to emulate a Raspberry Pi locally before ever hitting the device. Why?

  • Works anywhere you can install QEMU
  • No hardware setup needed (no more scratching around for a power supply)
  • Faster feedback cycle compared to hardware
  • I can use Pi software (like Raspbian) in a virtual context
  • I can prep my "virtual Pi" with all the tools I need regardless of my physical Pi's use case

Given I'm next-to-useless at Python, that last one is pretty important as it allows me to install every Python debugging and testing tool known to man on my virtual Pi while my end-product hardware stays comparatively pristine.

Getting started

First, you'll need a few prerequisites:

QEMU (more specifically qemu-system-arm)

You can find all the packages for your chosen platform on the QEMU website and is installable across Linux, macOS and even Windows.

Raspbian

Simply download the copy of Raspbian you need from the official site. Personally, I used the 2018-11-13 version of Raspbian Lite, since I don't need an X server.

Kernel

Since the standard RPi kernel can't be booted out of the box on QEMU, we'll need a custom kernel. We'll cover that in the next step.

Preparing

Get your kernel

First, you'll need to download a kernel. Personally, I (along with most people) use the dhruvvyas90/qemu-rpi-kernel repository's kernels. Either clone the repo:

or download a kernel directly:

or download a snapshot from my website directly:

For the rest of these steps I'm going to be using the kernel-qemu-4.4.34-jessiekernel, so update the commands as needed if you're using another version.

Filesystem image

This step is optional, but recommended

When you download the Raspbian image it will be in the raw format, a plain disk image (generally with an .img extension).

A more efficient option is to convert this to a qcow2 image first. Use the qemu-imgcommand to do this:

Now we can also easily expand the image:

You can check on your image using the qemu-img info command

Starting

You've got everything you need now: a kernel, a disk image, and QEMU!

Actually running the virtual Pi is done using the qemu-system-arm command and it can be quite complicated. The full command is this (don't worry it's explained below):

如果需要指定上网方式的话,执行如下命令:

So, in order:

  • sudo qemu-system-arm: you need to run QEMU as root
  • -kernel: this is the path to the QEMU kernel we downloaded in the previous step
  • -append: here we are providing the boot args direct to the kernel, telling it where to find it's root filesytem and what type it is
  • -hda: here we're attaching the disk image itself
  • -cpu/-m: this sets the CPU type and RAM limit to match a Raspberry Pi
  • -M: this sets the machine we are emulating. versatilepb is the 'ARM Versatile/PB' machine
  • -no-reboot: just tells QEMU to exit rather than rebooting the machine
  • -serial: redirects the machine's virtual serial port to our host's stdio
  • -net: this configures the machine's network stack to attach a NIC, use the user-mode stack, connect the host's vnet0 TAP device to the new NIC and don't use config scripts.

If it's all gone well, you should now have a QEMU window pop up and you should see the familiar Raspberry Pi boot screen show up.

Now, go get yourself a drink to celebrate, because it might take a little while.

Networking

Now, that's all well and good, but without networking, we may as well be back on hardware. When the machine started, it will have attached a NIC and connected it to the host's vnet0 TAP device. If we configure that device with an IP and add it to a bridge on our host, you should be able to reliably access it like any other virtual machine.

(on host) Find a bridge and address

This will vary by host, but on my Fedora machine, for example, there is a pre-configured virbr0 bridge interface with an address in the 192.168.122.0/24 space:

I'm going to use this bridge and just pick a static address for my Pi: 192.168.122.200

Reusing an existing (pre-configured) bridge means you won't need to sort your own routing

(in guest) Configure interface

NOTE: I'm assuming Stretch here.

Open /etc/dhcpcd.conf in your new virtual Pi and configure the eth0 interface with a static address in your bridge's subnet. For example, for my bridge:

You may need to reboot for this to take effect

(in host) Add TAP to bridge

Finally, add the machine's TAP interface to your chosen bridge with the brctl command:

Now, on your host, you should be able to ping 192.168.122.200 (or your Pi's address).

Set up SSH

Now, in your machine, you can run sudo raspi-config and enable the SSH server (in the "Interfacing Options" menu at time of writing).

Make sure you change the password from default while you're there!

Finally, on your host, run ssh-copy-id pi@192.168.122.200 to copy your SSH key into the Pi's pi user and you can now SSH directly into your Pi without a password prompt.

参考链接


Using QEMU to emulate a Raspberry Pi

Simple ARM NEON optimized sin, cos, log and exp

This is the sequel of the single precision SSE optimized sin, cos, log and exp that I wrote some time ago. Adapted to the NEON fpu of my pandaboard. Precision and range are exactly the same than the SSE version, so I won't repeat them.

The code

The functions below are licensed under the zlib license, so you can do basically what you want with them.

  • neon_mathfun.h source code for sin_ps, cos_ps, sincos_ps, exp_ps, log_ps, as straight C.
  • neon_mathfun_test.c Validation+Bench program for those function. Do not forget to run it once.

Performance

Results on a pandaboard with a 1GHz dual-core ARM Cortex A9 (OMAP4), using gcc 4.6.1

command line: gcc -O3 -mfloat-abi=softfp -mfpu=neon -march=armv7-a -mtune=cortex-a9 -Wall -W neon_mathfun_test.c -lm

So performance is not stellar. I recommend to use gcc 4.6.1 or newer as it generates much better code than previous (gcc 4.5) versions -- almost 20% faster here. I believe rewriting these functions in assembly would improve the performance by 30%, and should not be very hard as the ARM and NEON asm is quite nice and easy to write -- maybe I'll do it. Computing two SIMD vectors at once would also help to improve a lot the performance as there are enough registers on NEON, and it would reduce the dependancies between neon instructions.

Note also that I have no idea of the performance on a Cortex A8 -- it may be extremely bad, I don't know.

Comparison with an Intel Atom

For comparison purposes, here is the performance of the SSE version on a single core Intel Atom N270 running at 1.66GHz

command line: cl.exe /arch:SSE /O2 /TP /MD sse_mathfun_test.c (this is msvc 2010)

The number of cycles is quite similar -- but the atom has a higher clock..

Last modified: 2011/05/29

参考链接


Simple ARM NEON optimized sin, cos, log and exp

使用Git Hooks(Pre-Commit)实现代码提交的时候自动格式化代码

钩子文件在项目目录下

git 的钩子放在 git 项目下的 .git/hooks 目录。

如果我们所有项目都需要一个通用的钩子,那么我们需要在所有的项目中都放置钩子文件。挨个复制显然不是一个可行的方案。

模板目录

我们可用模板目录来解决这个问题。

在 git init 或者 git clone时,如果指定有模板目录,会使用拷贝模板目录下的文件到 .git/ 目录下。

好了,那么解决方案就是:把统一的钩子文件放到模板目录,然后在 git init / git clone 时候指定模板目录?

不行,这样还是太麻烦了。

模板目录写入全局配置

模板目录固定在一个地方,我们可以把模板目录写入全局配置。

在 git init 或者 git clone 时,会自动拷贝钩子文件到项目的钩子目录。 已有项目,执行 git init 重新初始化项目即可。

代码提交时候,自动格式化的参考代码如下:

使用的时候,简单的拷贝到.git/hooks 目录下,并重新命名为pre-commit。然后执行:

这样,每次提交代码的时候,都会自动格式化代码了。

参考链接


Git Hooks

Git 钩子

和其它版本控制系统一样,Git 能在特定的重要动作发生时触发自定义脚本。 有两组这样的钩子:客户端的和服务器端的。 客户端钩子由诸如提交和合并这样的操作所调用,而服务器端钩子作用于诸如接收被推送的提交这样的联网操作。 你可以随心所欲地运用这些钩子。

安装一个钩子

钩子都被存储在 Git 目录下的 hooks 子目录中。 也即绝大部分项目中的 .git/hooks 。 当你用 git init 初始化一个新版本库时,Git 默认会在这个目录中放置一些示例脚本。这些脚本除了本身可以被调用外,它们还透露了被触发时所传入的参数。 所有的示例都是 shell 脚本,其中一些还混杂了 Perl 代码,不过,任何正确命名的可执行脚本都可以正常使用 —— 你可以用 Ruby 或 Python,或其它语言编写它们。 这些示例的名字都是以 .sample 结尾,如果你想启用它们,得先移除这个后缀。

把一个正确命名且可执行的文件放入 Git 目录下的 hooks 子目录中,即可激活该钩子脚本。 这样一来,它就能被 Git 调用。 接下来,我们会讲解常用的钩子脚本类型。

客户端钩子

客户端钩子分为很多种。 下面把它们分为:提交工作流钩子、电子邮件工作流钩子和其它钩子。

Note
需要注意的是,克隆某个版本库时,它的客户端钩子 并不 随同复制。 如果需要靠这些脚本来强制维持某种策略,建议你在服务器端实现这一功能。(请参照 使用强制策略的一个例子 中的例子。)
提交工作流钩子

前四个钩子涉及提交的过程。

pre-commit 钩子在键入提交信息前运行。 它用于检查即将提交的快照,例如,检查是否有所遗漏,确保测试运行,以及核查代码。 如果该钩子以非零值退出,Git 将放弃此次提交,不过你可以用 git commit --no-verify 来绕过这个环节。 你可以利用该钩子,来检查代码风格是否一致(运行类似 lint 的程序)、尾随空白字符是否存在(自带的钩子就是这么做的),或新方法的文档是否适当。

prepare-commit-msg 钩子在启动提交信息编辑器之前,默认信息被创建之后运行。 它允许你编辑提交者所看到的默认信息。 该钩子接收一些选项:存有当前提交信息的文件的路径、提交类型和修补提交的提交的 SHA-1 校验。 它对一般的提交来说并没有什么用;然而对那些会自动产生默认信息的提交,如提交信息模板、合并提交、压缩提交和修订提交等非常实用。 你可以结合提交模板来使用它,动态地插入信息。

commit-msg 钩子接收一个参数,此参数即上文提到的,存有当前提交信息的临时文件的路径。 如果该钩子脚本以非零值退出,Git 将放弃提交,因此,可以用来在提交通过前验证项目状态或提交信息。 在本章的最后一节,我们将展示如何使用该钩子来核对提交信息是否遵循指定的模板。

post-commit 钩子在整个提交过程完成后运行。 它不接收任何参数,但你可以很容易地通过运行 git log -1 HEAD 来获得最后一次的提交信息。 该钩子一般用于通知之类的事情。

电子邮件工作流钩子

你可以给电子邮件工作流设置三个客户端钩子。 它们都是由 git am 命令调用的,因此如果你没有在你的工作流中用到这个命令,可以跳到下一节。 如果你需要通过电子邮件接收由 git format-patch 产生的补丁,这些钩子也许用得上。

第一个运行的钩子是 applypatch-msg 。 它接收单个参数:包含请求合并信息的临时文件的名字。 如果脚本返回非零值,Git 将放弃该补丁。 你可以用该脚本来确保提交信息符合格式,或直接用脚本修正格式错误。

下一个在 git am 运行期间被调用的是 pre-applypatch 。 有些难以理解的是,它正好运行于应用补丁 之后,产生提交之前,所以你可以用它在提交前检查快照。 你可以用这个脚本运行测试或检查工作区。 如果有什么遗漏,或测试未能通过,脚本会以非零值退出,中断 git am 的运行,这样补丁就不会被提交。

post-applypatch 运行于提交产生之后,是在 git am 运行期间最后被调用的钩子。 你可以用它把结果通知给一个小组或所拉取的补丁的作者。 但你没办法用它停止打补丁的过程。

其它客户端钩子

pre-rebase 钩子运行于变基之前,以非零值退出可以中止变基的过程。 你可以使用这个钩子来禁止对已经推送的提交变基。 Git 自带的 pre-rebase 钩子示例就是这么做的,不过它所做的一些假设可能与你的工作流程不匹配。

post-rewrite 钩子被那些会替换提交记录的命令调用,比如 git commit --amend 和 git rebase(不过不包括 git filter-branch)。 它唯一的参数是触发重写的命令名,同时从标准输入中接受一系列重写的提交记录。 这个钩子的用途很大程度上跟 post-checkout 和 post-merge 差不多。

在 git checkout 成功运行后,post-checkout 钩子会被调用。你可以根据你的项目环境用它调整你的工作目录。 其中包括放入大的二进制文件、自动生成文档或进行其他类似这样的操作。

在 git merge 成功运行后,post-merge 钩子会被调用。 你可以用它恢复 Git 无法跟踪的工作区数据,比如权限数据。 这个钩子也可以用来验证某些在 Git 控制之外的文件是否存在,这样你就能在工作区改变时,把这些文件复制进来。

pre-push 钩子会在 git push 运行期间, 更新了远程引用但尚未传送对象时被调用。 它接受远程分支的名字和位置作为参数,同时从标准输入中读取一系列待更新的引用。 你可以在推送开始之前,用它验证对引用的更新操作(一个非零的退出码将终止推送过程)。

Git 的一些日常操作在运行时,偶尔会调用 git gc --auto 进行垃圾回收。 pre-auto-gc 钩子会在垃圾回收开始之前被调用,可以用它来提醒你现在要回收垃圾了,或者依情形判断是否要中断回收。

服务器端钩子

除了客户端钩子,作为系统管理员,你还可以使用若干服务器端的钩子对项目强制执行各种类型的策略。 这些钩子脚本在推送到服务器之前和之后运行。 推送到服务器前运行的钩子可以在任何时候以非零值退出,拒绝推送并给客户端返回错误消息,还可以依你所想设置足够复杂的推送策略。

pre-receive

处理来自客户端的推送操作时,最先被调用的脚本是 pre-receive。 它从标准输入获取一系列被推送的引用。如果它以非零值退出,所有的推送内容都不会被接受。 你可以用这个钩子阻止对引用进行非快进(non-fast-forward)的更新,或者对该推送所修改的所有引用和文件进行访问控制。

update

update 脚本和 pre-receive 脚本十分类似,不同之处在于它会为每一个准备更新的分支各运行一次。 假如推送者同时向多个分支推送内容,pre-receive 只运行一次,相比之下 update 则会为每一个被推送的分支各运行一次。 它不会从标准输入读取内容,而是接受三个参数:引用的名字(分支),推送前的引用指向的内容的 SHA-1 值,以及用户准备推送的内容的 SHA-1 值。 如果 update 脚本以非零值退出,只有相应的那一个引用会被拒绝;其余的依然会被更新。

post-receive

post-receive 挂钩在整个过程完结以后运行,可以用来更新其他系统服务或者通知用户。 它接受与 pre-receive 相同的标准输入数据。 它的用途包括给某个邮件列表发信,通知持续集成(continous integration)的服务器,或者更新问题追踪系统(ticket-tracking system) —— 甚至可以通过分析提交信息来决定某个问题(ticket)是否应该被开启,修改或者关闭。 该脚本无法终止推送进程,不过客户端在它结束运行之前将保持连接状态,所以如果你想做其他操作需谨慎使用它,因为它将耗费你很长的一段时间。

参考链接


macOS Mojave(10.14.3)安装qemu

Install Homebrew:

Install qemu:

参考链接