pause() in Octave 5.1.0 on macOS Mojave (10.14.4) does not respond to key presses

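As of 2019/04/24, on `macOS Mojave` (`10.14.4`), after installing `Octave 5.1.0` with `brew install octave`, the `pause()` function does not resume execution after a key press: no key other than `Ctrl + C` gets a response. Normally, pressing any key should let the code that follows continue executing.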

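The cause is that the `Octave 5.1.0` currently installed via `brew` is linked at build time against a library version from `glibc 2.28` onward, and some behavior changed in `glibc 2.28`. For the details, see the discussion under bug #55029: pause() with no arguments does not return like kbhit() with glibc 2.28. In essence, `glibc 2.28` and later require the application to call `clearerr (stdin);` itself once input has hit end-of-file (`EOF`); otherwise it never receives subsequent key presses. This bug is fixed in `Octave 5.2`, but no release date for that version has been set yet.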
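The sticky end-of-file behavior described above can be sketched in plain C. This is only an illustration of the `clearerr` pattern, not Octave's actual patch; the helper name `read_char_after_eof` is hypothetical.

```c
#include <stdio.h>

/* Read one character from `in` after it may already have reported EOF.
 * Hypothetical helper for illustration only. Once a stream has hit
 * end-of-file, its EOF indicator stays set; clearing it first lets the
 * next fgetc() attempt a real read instead of returning EOF again. */
int read_char_after_eof(FILE *in)
{
    clearerr(in);      /* reset the end-of-file and error indicators */
    return fgetc(in);  /* try to read fresh input */
}
```

A program waiting for a key press would call such a helper on `stdin` each time it wants a new key, so that an earlier `EOF` does not block later input.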

The current workaround is to have `brew` build and install from the latest source code instead of installing the released version, as follows:

$ brew uninstall --ignore-dependencies octave

# Install build dependencies
$ brew install texinfo

$ wget https://raw.githubusercontent.com/Homebrew/homebrew-core/master/Formula/octave.rb

$ sed -i "" "s/\"--enable-shared\"/\"--enable-shared\",\"--disable-docs\"/g" octave.rb

$ brew install --build-from-source --HEAD -v octave.rb

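The `sed` command modifies the downloaded formula file and turns off the documentation build (which currently fails), that is, it adds the `--disable-docs` configure option.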

The adjusted formula is as follows:

class Octave < Formula
  desc "High-level interpreted language for numerical computing"
  homepage "https://www.gnu.org/software/octave/index.html"
  url "https://ftp.gnu.org/gnu/octave/octave-5.1.0.tar.xz"
  mirror "https://ftpmirror.gnu.org/octave/octave-5.1.0.tar.xz"
  sha256 "87b4df6dfa28b1f8028f69659f7a1cabd50adfb81e1e02212ff22c863a29454e"
  revision 2

  bottle do
    sha256 "6bb8497839d6f7872efcd6acad0216f443420e097a9b7fad44835823e1c0e735" => :mojave
    sha256 "d1de53a30f002d8b7ec3a6065994c46d8cbd4830aa7e199f572baff48723c6e6" => :high_sierra
    sha256 "7a648cff129ec85a5ee9417a0339a3b804756f7958585b707c015d322d220b15" => :sierra
  end

  head do
    url "https://hg.savannah.gnu.org/hgweb/octave", :branch => "default", :using => :hg

    depends_on "autoconf" => :build
    depends_on "automake" => :build
    depends_on "bison" => :build
    depends_on "icoutils" => :build
    depends_on "librsvg" => :build
  end

  # Complete list of dependencies at https://wiki.octave.org/Building
  depends_on "gnu-sed" => :build # https://lists.gnu.org/archive/html/octave-maintainers/2016-09/msg00193.html
  depends_on :java => ["1.6+", :build]
  depends_on "pkg-config" => :build
  depends_on "arpack"
  depends_on "epstool"
  depends_on "fftw"
  depends_on "fig2dev"
  depends_on "fltk"
  depends_on "fontconfig"
  depends_on "freetype"
  depends_on "gcc" # for gfortran
  depends_on "ghostscript"
  depends_on "gl2ps"
  depends_on "glpk"
  depends_on "gnuplot"
  depends_on "graphicsmagick"
  depends_on "hdf5"
  depends_on "libsndfile"
  depends_on "libtool"
  depends_on "pcre"
  depends_on "portaudio"
  depends_on "pstoedit"
  depends_on "qhull"
  depends_on "qrupdate"
  depends_on "qt"
  depends_on "readline"
  depends_on "suite-sparse"
  depends_on "sundials"
  depends_on "texinfo"
  depends_on "veclibfort"

  # Dependencies use Fortran, leading to spurious messages about GCC
  cxxstdlib_check :skip

  def install
    # Default configuration passes all linker flags to mkoctfile, to be
    # inserted into every oct/mex build. This is unnecessary and can
    # cause linking problems.
    inreplace "src/mkoctfile.in.cc",
              /%OCTAVE_CONF_OCT(AVE)?_LINK_(DEPS|OPTS)%/,
              '""'

    # Qt 5.12 compatibility
    # https://savannah.gnu.org/bugs/?55187
    ENV["QCOLLECTIONGENERATOR"] = "qhelpgenerator"
    # These "shouldn't" be necessary, but the build breaks without them.
    # https://savannah.gnu.org/bugs/?55883
    ENV["QT_CPPFLAGS"]="-I#{Formula["qt"].opt_include}"
    ENV.append "CPPFLAGS", "-I#{Formula["qt"].opt_include}"
    ENV["QT_LDFLAGS"]="-F#{Formula["qt"].opt_lib}"
    ENV.append "LDFLAGS", "-F#{Formula["qt"].opt_lib}"

    system "./bootstrap" if build.head?
    system "./configure", "--prefix=#{prefix}",
                          "--disable-dependency-tracking",
                          "--disable-silent-rules",
                          "--enable-link-all-dependencies",
                          "--enable-shared","--disable-docs",
                          "--disable-static",
                          "--with-hdf5-includedir=#{Formula["hdf5"].opt_include}",
                          "--with-hdf5-libdir=#{Formula["hdf5"].opt_lib}",
                          "--with-x=no",
                          "--with-blas=-L#{Formula["veclibfort"].opt_lib} -lvecLibFort",
                          "--with-portaudio",
                          "--with-sndfile"
    system "make", "all"

    # Avoid revision bumps whenever fftw's or gcc's Cellar paths change
    inreplace "src/mkoctfile.cc" do |s|
      s.gsub! Formula["fftw"].prefix.realpath, Formula["fftw"].opt_prefix
      s.gsub! Formula["gcc"].prefix.realpath, Formula["gcc"].opt_prefix
    end

    # Make sure that Octave uses the modern texinfo at run time
    rcfile = buildpath/"scripts/startup/site-rcfile"
    rcfile.append_lines "makeinfo_program(\"#{Formula["texinfo"].opt_bin}/makeinfo\");"

    system "make", "install"
  end

  test do
    system bin/"octave", "--eval", "(22/7 - pi)/pi"
    # This is supposed to crash octave if there is a problem with veclibfort
    system bin/"octave", "--eval", "single ([1+i 2+i 3+i]) * single ([ 4+i ; 5+i ; 6+i])"
  end
end


Simple ARM NEON optimized sin, cos, log and exp

This is the sequel to the single-precision SSE-optimized sin, cos, log and exp that I wrote some time ago, adapted to the NEON FPU of my PandaBoard. Precision and range are exactly the same as in the SSE version, so I won't repeat them.

The code

The functions below are licensed under the zlib license, so you can do basically what you want with them.

  • neon_mathfun.h source code for sin_ps, cos_ps, sincos_ps, exp_ps, log_ps, as straight C.
  • neon_mathfun_test.c Validation and benchmark program for those functions. Do not forget to run it once.

Performance

Results on a pandaboard with a 1GHz dual-core ARM Cortex A9 (OMAP4), using gcc 4.6.1

command line: gcc -O3 -mfloat-abi=softfp -mfpu=neon -march=armv7-a -mtune=cortex-a9 -Wall -W neon_mathfun_test.c -lm

exp([        -1000,          -100,           100,          1000]) = [            0,             0, 2.4061436e+38, 2.4061436e+38]
exp([         -nan,           inf,          -inf,           nan]) = [          nan, 2.4061436e+38,             0,           nan]
log([            0,           -10,         1e+30, 1.0005271e-42]) = [         -nan,          -nan,     69.077553,          -nan]
log([         -nan,           inf,          -inf,           nan]) = [    89.128304,     88.722839,          -nan,     89.128304]
sin([         -nan,           inf,          -inf,           nan]) = [          nan,           nan,          -nan,           nan]
cos([         -nan,           inf,          -inf,           nan]) = [          nan,           nan,           nan,           nan]
sin([       -1e+30,       -100000,         1e+30,        100000]) = [          inf,  -0.035749275,          -inf,   0.035749275]
cos([       -1e+30,       -100000,         1e+30,        100000]) = [          nan,    -0.9993608,           nan,    -0.9993608]
benching                 sinf .. ->    2.0 millions of vector evaluations/second -> 121 cycles/value on a 1000MHz computer
benching                 cosf .. ->    1.8 millions of vector evaluations/second -> 132 cycles/value on a 1000MHz computer
benching                 expf .. ->    1.1 millions of vector evaluations/second -> 221 cycles/value on a 1000MHz computer
benching                 logf .. ->    1.7 millions of vector evaluations/second -> 141 cycles/value on a 1000MHz computer
benching          cephes_sinf .. ->    2.4 millions of vector evaluations/second -> 103 cycles/value on a 1000MHz computer
benching          cephes_cosf .. ->    2.0 millions of vector evaluations/second -> 123 cycles/value on a 1000MHz computer
benching          cephes_expf .. ->    1.6 millions of vector evaluations/second -> 153 cycles/value on a 1000MHz computer
benching          cephes_logf .. ->    1.5 millions of vector evaluations/second -> 156 cycles/value on a 1000MHz computer
benching               sin_ps .. ->    5.8 millions of vector evaluations/second ->  43 cycles/value on a 1000MHz computer
benching               cos_ps .. ->    5.9 millions of vector evaluations/second ->  42 cycles/value on a 1000MHz computer
benching            sincos_ps .. ->    6.0 millions of vector evaluations/second ->  41 cycles/value on a 1000MHz computer
benching               exp_ps .. ->    5.6 millions of vector evaluations/second ->  44 cycles/value on a 1000MHz computer
benching               log_ps .. ->    5.3 millions of vector evaluations/second ->  47 cycles/value on a 1000MHz computer

So performance is not stellar. I recommend using gcc 4.6.1 or newer, as it generates much better code than previous (gcc 4.5) versions -- almost 20% faster here. I believe rewriting these functions in assembly would improve performance by 30%, and it should not be very hard, as ARM and NEON asm is quite nice and easy to write -- maybe I'll do it. Computing two SIMD vectors at once would also improve performance a lot, since NEON has enough registers, and it would reduce the dependencies between NEON instructions.

Note also that I have no idea of the performance on a Cortex A8 -- it may be extremely bad, I don't know.

Comparison with an Intel Atom

For comparison purposes, here is the performance of the SSE version on a single-core Intel Atom N270 running at 1.6GHz

command line: cl.exe /arch:SSE /O2 /TP /MD sse_mathfun_test.c (this is msvc 2010)

benching                 sinf .. ->    1.3 millions of vector evaluations/second -> 303 cycles/value on a 1600MHz computer
benching                 cosf .. ->    1.3 millions of vector evaluations/second -> 305 cycles/value on a 1600MHz computer
benching         sincos (x87) .. ->    1.2 millions of vector evaluations/second -> 314 cycles/value on a 1600MHz computer
benching                 expf .. ->    1.6 millions of vector evaluations/second -> 244 cycles/value on a 1600MHz computer
benching                 logf .. ->    1.4 millions of vector evaluations/second -> 276 cycles/value on a 1600MHz computer
benching          cephes_sinf .. ->    1.4 millions of vector evaluations/second -> 280 cycles/value on a 1600MHz computer
benching          cephes_cosf .. ->    1.5 millions of vector evaluations/second -> 265 cycles/value on a 1600MHz computer
benching          cephes_expf .. ->    0.7 millions of vector evaluations/second -> 548 cycles/value on a 1600MHz computer
benching          cephes_logf .. ->    0.8 millions of vector evaluations/second -> 489 cycles/value on a 1600MHz computer
benching               sin_ps .. ->    9.2 millions of vector evaluations/second ->  43 cycles/value on a 1600MHz computer
benching               cos_ps .. ->    9.5 millions of vector evaluations/second ->  42 cycles/value on a 1600MHz computer
benching            sincos_ps .. ->    8.8 millions of vector evaluations/second ->  45 cycles/value on a 1600MHz computer
benching               exp_ps .. ->    9.8 millions of vector evaluations/second ->  41 cycles/value on a 1600MHz computer
benching               log_ps .. ->    8.6 millions of vector evaluations/second ->  46 cycles/value on a 1600MHz computer

The number of cycles is quite similar -- but the Atom has a higher clock.
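As a rough cross-check of the two tables, the cycles/value column can be recovered from the throughput column. The sketch below assumes that one "vector evaluation" processes 4 single-precision floats (one 128-bit NEON or SSE register); the function name `cycles_per_value` is made up for this illustration, and the benchmark program itself may compute the figure from unrounded timings.

```c
/* Approximate cycles spent per scalar result.
 * clock_mhz    : CPU clock in MHz, as printed by the benchmark
 * mvec_per_sec : millions of vector evaluations per second
 * Assumption: each vector evaluation yields 4 float results. */
double cycles_per_value(double clock_mhz, double mvec_per_sec)
{
    const double lanes = 4.0; /* floats per 128-bit vector (assumed) */
    return clock_mhz / (mvec_per_sec * lanes);
}
```

For `sin_ps`, `cycles_per_value(1000.0, 5.8)` gives about 43.1 on the Cortex A9 and `cycles_per_value(1600.0, 9.2)` gives about 43.5 on the Atom, in line with the 43 cycles/value printed for both: the per-cycle cost is nearly the same, and the Atom's higher throughput comes from its clock.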

Last modified: 2011/05/29

Reference links


Simple ARM NEON optimized sin, cos, log and exp

Calling C code from MATLAB

Sometimes you need to debug a library written in C from MATLAB and inspect the results of running it there.

A complete reference example follows:

#include <stdlib.h>  // malloc, free
#include <mex.h>

// Check whether mxCommand matches the given command (trimmed, case-insensitive)
static bool commandIs(const mxArray* mxCommand, const char* command)
{
    double result;
    mxArray* plhs1[1];
    mxArray* prhs1[1];
    mxArray* plhs2[1];  
    mxArray* prhs2[2];

    if (mxCommand == NULL) { mexErrMsgTxt("'mxCommand' is null"); return false; }
    if (command == NULL) { mexErrMsgTxt("'command' is null"); return false; }
    if (!mxIsChar(mxCommand)) { mexErrMsgTxt("'mxCommand' is not a string"); return false; }

    // First trim
    prhs1[0] = (mxArray*)mxCommand;
    mexCallMATLAB(1, plhs1, 1, prhs1, "strtrim");

    // Then compare
    prhs2[0] = mxCreateString(command);
    prhs2[1] = plhs1[0];
    mexCallMATLAB(1, plhs2, 2, prhs2, "strcmpi");

    // Return comparison result
    result = mxGetScalar(plhs2[0]);  
    return (result != 0.0);
}

static void processHelpMessageCommand(void)
{
    mexPrintf("DspMgr('init') returns a handle, or nil on failure; call 'release' to free the memory\n");
    mexPrintf("DspMgr('release',handle) frees the memory\n");
}

static void processInitCommand(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{        
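    // Allocate the native buffer whose address becomes the handle
    char* example_buffer = malloc(512);
    // Return the address to MATLAB inside a 1x1 uint64 array
    plhs[0] = mxCreateNumericMatrix(1,1,mxUINT64_CLASS,mxREAL);
    long long *ip = (long long *) mxGetData(plhs[0]);
    *ip = (long long)example_buffer;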
}

static void processReleaseCommand(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
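    if(nrhs != 2) {
        mexErrMsgTxt("release needs 1 parameter: the handle returned by init");
    } else {
        if(!mxIsUint64(prhs[1])) {
           mexErrMsgTxt("release handle must be UINT64 format");
           return;
        }

        size_t M=mxGetM(prhs[1]); // number of rows in the matrix
        size_t N=mxGetN(prhs[1]); // number of columns in the matrix
        // The handle must be a scalar: reject it if either dimension is not 1
        if((1 != M) || (1 != N)) {
           mexErrMsgTxt("release handle must be 1*1 array format");
           return;
        }

        // Read the raw 64-bit value; mxGetScalar would convert it to double
        // and could lose the low bits of a large address
        long long ip = *(long long *)mxGetData(prhs[1]);
        char* example_buffer = (char*)ip;
        free(example_buffer);

        // Return true to avoid a warning
        plhs[0] = mxCreateNumericMatrix(1,1,mxINT8_CLASS,mxREAL);
        char* mx_data = (char *) mxGetData(plhs[0]);
        mx_data[0] = 1;
    }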
}

// Mex entry point
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    // Arguments parsing
    if (nrhs < 1) { mexErrMsgTxt("Not enough input arguments. use 'DspMgr help' for help message."); return; }
    if (!mxIsChar(prhs[0])) { mexErrMsgTxt("First parameter must be a string."); return; }

    // Command selection
    if (commandIs(prhs[0], "HELP")) { processHelpMessageCommand(); }
    else if (commandIs(prhs[0], "init")) { processInitCommand(nlhs, plhs, nrhs, prhs); }
    else if (commandIs(prhs[0], "release")) { processReleaseCommand(nlhs, plhs, nrhs, prhs); }
    else { mexErrMsgTxt("Unknown command or command not implemented yet."); }
}

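Note in particular how the example above hides a pointer allocated in C and passes it to MATLAB: the address is stored as an integer in a 1x1 `uint64` array.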
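The pointer-hiding idea can be isolated from the MEX API. The following is a minimal sketch, assuming a platform where pointers fit in 64 bits; `handle_from_pointer` and `pointer_from_handle` are hypothetical names that do not appear in the example above, and `mex.h` is deliberately left out so the snippet compiles on its own.

```c
#include <stdint.h>
#include <stdlib.h>

/* Pack a C pointer into a 64-bit integer, suitable for storing in a
 * 1x1 uint64 array. Going through uintptr_t keeps the cast well defined. */
uint64_t handle_from_pointer(void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* Recover the original pointer from a handle produced above. */
void *pointer_from_handle(uint64_t handle)
{
    return (void *)(uintptr_t)handle;
}
```

In a MEX file, the value returned by `handle_from_pointer` would be written into the data of the `uint64` output array on `init`, and read back unchanged on `release` before calling `free`. Keeping the value as an integer the whole way avoids any conversion through `double`.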

The MATLAB side looks like this:

mex -output DspMgr 'CFLAGS="\$CFLAGS -std=c99"' '*.c'

v = DspMgr('init')

DspMgr('release',v)


Taylor's formula

Taylor's formula approximates a function f(x) that has derivatives up to order n at x = x0 by a polynomial of degree n in (x - x0).

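If f(x) has derivatives up to order n on a closed interval [a,b] containing x0, and derivatives up to order (n+1) on the open interval (a,b), then for every point x in [a,b] the following holds:

$$f(x) = f(x_0) + \frac{f'(x_0)}{1!}(x-x_0) + \frac{f''(x_0)}{2!}(x-x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x-x_0)^n + R_n(x)$$

Here $f^{(n)}(x)$ denotes the n-th derivative of f(x). The polynomial after the equals sign is called the Taylor expansion of f(x) at x0, and the remaining term $R_n(x)$ is the remainder of Taylor's formula, an infinitesimal of higher order than $(x-x_0)^n$.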

Note that, by convention, the factorial of 0 is defined as 0! = 1.
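As a small worked instance of the formula, take f(x) = exp(x) and x0 = 0, where every derivative at x0 equals 1, so the degree-n Taylor polynomial is the sum of x^k / k! for k = 0..n. The sketch below is only an illustration; the function name `taylor_exp` is chosen here and is not part of any library.

```c
/* Degree-n Taylor polynomial of exp(x) about x0 = 0:
 *   1 + x + x^2/2! + ... + x^n/n!                      */
double taylor_exp(double x, int n)
{
    double term = 1.0;  /* k = 0 term: x^0 / 0! = 1, which relies on 0! = 1 */
    double sum = term;
    for (int k = 1; k <= n; ++k) {
        term *= x / k;  /* turn x^(k-1)/(k-1)! into x^k/k! */
        sum += term;
    }
    return sum;
}
```

Calling `taylor_exp(1.0, 10)` approximates e; the difference from the true value is the remainder R_10(1), which shrinks quickly as n grows.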


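Greek alphabet for common mathematical symbols

Greek alphabet

| No. | Uppercase | Lowercase | English name | IPA transcription | Chinese reading | Meaning |
|-----|-----------|-----------|--------------|-------------------|-----------------|---------|
| 1 | Α | α | alpha | a:lf | 阿尔法 | Angle; coefficient |
| 2 | Β | β | beta | bet | 贝塔 | Flux coefficient; angle; coefficient |
| 3 | Γ | γ | gamma | ga:m | 伽马 | Conductivity (lowercase) |
| 4 | Δ | δ | delta | delt | 德尔塔 | Variation; density; diopter |
| 5 | Ε | ε | epsilon | ep'silon | 艾普西龙 | Base of logarithms |
| 6 | Ζ | ζ | zeta | zat | 截塔 | Coefficient; azimuth; impedance; relative viscosity; atomic number |
| 7 | Η | η | eta | eit | 艾塔 | Hysteresis coefficient; efficiency (lowercase) |
| 8 | Θ | θ | theta | θit | 西塔 | Temperature; phase angle |
| 9 | Ι | ι | iota | aiot | 约塔 | Tiny, a little bit |
| 10 | Κ | κ | kappa | kap | 卡帕 | Dielectric constant |
| 11 | Λ | λ | lambda | lambd | 兰布达 | Wavelength (lowercase); volume |
| 12 | Μ | μ | mu | mju | | Permeability; micro (one millionth); amplification factor (lowercase) |
| 13 | Ν | ν | nu | nju | | Reluctivity |
| 14 | Ξ | ξ | xi | ksi | 克西 | Random variable (in mathematics) |
| 15 | Ο | ο | omicron | omik'ron | 奥密克戎 | |
| 16 | Π | π | pi | pai | | Ratio of a circle's circumference to its diameter = 3.14159 26535 89793 |
| 17 | Ρ | ρ | rho | rou | | Resistivity (lowercase) |
| 18 | Σ | σ | sigma | 'sigma | 西格马 | Summation (uppercase); surface density; transconductance (lowercase) |
| 19 | Τ | τ | tau | tau | | Time constant |
| 20 | Υ | υ | upsilon | jup'silon | 伊普西龙 | Displacement |
| 21 | Φ | φ | phi | fai | 佛爱 | Magnetic flux; angle |
| 22 | Χ | χ | chi | phai | 西 | |
| 23 | Ψ | ψ | psi | psai | 普西 | Angular velocity; dielectric flux (electrostatic lines of force); angle |
| 24 | Ω | ω | omega | o'miga | 欧米伽 | Ohm (uppercase); angular velocity (lowercase); angle |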