# using Makefile
make clean
WHISPER_COREML=1 make -j
If the run reports that the Core ML model was loaded correctly, the build succeeded:
./main -m models/ggml-large-v3-q5_0.bin ./samples/jfk.wav
...
whisper_init_state: kv cross size = 245.76 MB
whisper_init_state: loading Core ML model from 'models/ggml-large-v3-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: Core ML model loaded
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size = 8.80 MiB, ( 1490.55 / 12288.02)
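The check described above can be automated by searching the captured log for the success line. This is a minimal sketch: the helper name `coreml_loaded` and the file name `run.log` are made up for illustration, and it assumes the initialization messages are written to stderr, which may differ between builds.

```shell
# Hypothetical helper: succeeds (exit status 0) only if the log file
# passed as $1 contains the Core ML success line shown in the output above.
coreml_loaded() {
    grep -q 'whisper_init_state: Core ML model loaded' "$1"
}

# Example usage (assumes the init log goes to stderr; adjust the
# redirection if your build prints it elsewhere):
#   ./main -m models/ggml-large-v3-q5_0.bin ./samples/jfk.wav 2> run.log
#   coreml_loaded run.log && echo "Core ML encoder is active"
```

If the helper fails, the binary was most likely built without `WHISPER_COREML=1`, or the `.mlmodelc` directory is not next to the model file.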
usage: whisper-cpp [options] file0.wav file1.wav ...
options:
-h, --help [default] show this help message and exit
-t N, --threads N [4 ] number of threads to use during computation
-p N, --processors N [1 ] number of processors to use during computation
-ot N, --offset-t N [0 ] time offset in milliseconds
-on N, --offset-n N [0 ] segment index offset
-d N, --duration N [0 ] duration of audio to process in milliseconds
-mc N, --max-context N [-1] maximum number of text context tokens to store
-ml N, --max-len N [0 ] maximum segment length in characters
-sow, --split-on-word [false] split on word rather than on token
-bo N, --best-of N [5 ] number of best candidates to keep
-bs N, --beam-size N [5 ] beam size for beam search
-wt N, --word-thold N [0.01 ] word timestamp probability threshold
-et N, --entropy-thold N [2.40 ] entropy threshold for decoder fail
-lpt N, --logprob-thold N [-1.00 ] log probability threshold for decoder fail
-debug, --debug-mode [false] enable debug mode (eg. dump log_mel)
-tr, --translate [false] translate from source language to english
-di, --diarize [false] stereo audio diarization
-tdrz, --tinydiarize [false] enable tinydiarize (requires a tdrz model)
-nf, --no-fallback [false] do not use temperature fallback while decoding
-otxt, --output-txt [false] output result in a text file
-ovtt, --output-vtt [false] output result in a vtt file
-osrt, --output-srt [false] output result in a srt file
-olrc, --output-lrc [false] output result in a lrc file
-owts, --output-words [false] output script for generating karaoke video
-fp, --font-path [/System/Library/Fonts/Supplemental/Courier New Bold.ttf] path to a monospace font for karaoke video
-ocsv, --output-csv [false] output result in a CSV file
-oj, --output-json [false] output result in a JSON file
-ojf, --output-json-full [false] include more information in the JSON file
-of FNAME, --output-file FNAME [] output file path (without file extension)
-ps, --print-special [false] print special tokens
-pc, --print-colors [false] print colors
-pp, --print-progress [false] print progress
-nt, --no-timestamps [false] do not print timestamps
-l LANG, --language LANG [en ] spoken language ('auto' for auto-detect)
-dl, --detect-language [false] exit after automatically detecting language
--prompt PROMPT [] initial prompt
-m FNAME, --model FNAME [models/ggml-base.en.bin] model path
-f FNAME, --file FNAME [] input WAV file path
-oved D, --ov-e-device DNAME [CPU ] the OpenVINO device used for encode inference
-ls, --log-score [false] log best decoder scores of tokens
-ng, --no-gpu [false] disable GPU
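As an illustration of how the flags listed above combine, here is a small wrapper sketch that transcribes one WAV file to an SRT subtitle file with automatic language detection. The function name `transcribe_to_srt` and the `WHISPER_BIN` variable are hypothetical; the flags (`-m`, `-l`, `-osrt`, `-of`, `-f`) and the model path are the ones shown earlier in this section.

```shell
# Hypothetical variable: path to the whisper.cpp binary.
# Defaults to ./main as built above.
WHISPER_BIN="${WHISPER_BIN:-./main}"

# Hypothetical wrapper: $1 = input WAV file, $2 = output path without extension.
# -l auto   lets the model detect the spoken language
# -osrt     writes the result as an .srt file
# -of       sets the output path (the extension is appended by the tool)
transcribe_to_srt() {
    "$WHISPER_BIN" -m models/ggml-large-v3-q5_0.bin \
        -l auto -osrt -of "$2" -f "$1"
}

# Example usage:
#   transcribe_to_srt ./samples/jfk.wav ./samples/jfk
```

Keeping the binary path in a variable makes it easy to point the same wrapper at a differently named executable, such as the `whisper-cpp` name that appears in the usage line.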