TensorRT Open Source Software
This repository contains the Open Source Software (OSS) components of NVIDIA TensorRT. Included are the sources for the TensorRT plugins and parsers (Caffe and ONNX), as well as sample applications demonstrating usage and capabilities of the TensorRT platform. These open source software components are a subset of the TensorRT General Availability (GA) release with some extensions and bug fixes.
For code contributions to TensorRT OSS, please see our Contribution Guide and Coding Guidelines.
For a summary of new additions and updates shipped with TensorRT OSS releases, please refer to the Changelog.
Build
Prerequisites
To build the TensorRT OSS components, you will first need the following software packages.
Reference: https://github.com/NVIDIA/TensorRT
TensorRT GA build
TensorRT v7.2.1
See Downloading TensorRT Builds for details
System Packages
CUDA
Recommended versions:
cuda-11.1 + cuDNN-8.0
cuda-11.0 + cuDNN-8.0
cuda-10.2 + cuDNN-8.0
GNU make >= v4.1
cmake >= v3.13
python >= v3.6.5
pip >= v19.0
Essential utilities
git, pkg-config, wget, zlib
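On Debian/Ubuntu systems, for example, these utilities can typically be installed with apt (the package names below, including zlib1g-dev for the zlib development headers, are an assumption for Ubuntu 18.04; adjust for your distribution's package manager):
# Install the essential build utilities listed above (Ubuntu/Debian package names assumed)
sudo apt-get update
sudo apt-get install -y git pkg-config wget zlib1g-dev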
Optional Packages
Containerized build
Docker >= 19.03
NVIDIA Container Toolkit
Toolchains and SDKs
(Cross compilation for Jetson platform) NVIDIA JetPack >= 4.4
(For Windows builds) Visual Studio 2017 Community or Enterprise edition
(Cross compilation for QNX platform) QNX Toolchain
PyPI packages (for demo applications/tests)
numpy
onnx 1.6.0
onnxruntime >= 1.3.0
pytest
tensorflow-gpu 1.15.4
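These PyPI packages can typically be installed in one pip command, for example as sketched below (version pins taken from the list above; use the interpreter/virtualenv you intend to run the demos with):
# Install the demo/test dependencies listed above
pip3 install numpy onnx==1.6.0 'onnxruntime>=1.3.0' pytest tensorflow-gpu==1.15.4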
Code formatting tools (for contributors)
Clang-format
Git-clang-format
NOTE: onnx-tensorrt, cub, and protobuf packages are downloaded along with TensorRT OSS and do not need to be installed separately.
Downloading TensorRT Build
- Download TensorRT OSS
On Linux: Bash
git clone -b master https://github.com/nvidia/TensorRT TensorRT
cd TensorRT
git submodule update --init --recursive
export TRT_SOURCE=`pwd`
On Windows: Powershell
git clone -b master https://github.com/nvidia/TensorRT TensorRT
cd TensorRT
git submodule update --init --recursive
$Env:TRT_SOURCE = $(Get-Location)
- Download TensorRT GA
To build TensorRT OSS, obtain the corresponding TensorRT GA build from NVIDIA Developer Zone.
Example: Ubuntu 18.04 on x86-64 with cuda-11.1
Download and extract the latest TensorRT 7.2.1 GA package for Ubuntu 18.04 and CUDA 11.1
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-18.04.x86_64-gnu.cuda-11.1.cudnn8.0.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6
Example: Ubuntu 18.04 on PowerPC with cuda-11.0
Download and extract the latest TensorRT 7.2.1 GA package for Ubuntu 18.04 and CUDA 11.0
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-18.04.powerpc64le-gnu.cuda-11.0.cudnn8.0.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6
Example: CentOS/RedHat 7 on x86-64 with cuda-11.0
Download and extract the TensorRT 7.2.1 GA for CentOS/RedHat 7 and CUDA 11.0 tar package
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.CentOS-7.6.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6
Example: Ubuntu 18.04 Cross-Compile for QNX with cuda-10.2
Download and extract the TensorRT 7.2.1 GA for QNX and CUDA 10.2 tar package
cd ~/Downloads
tar -xvzf TensorRT-7.2.1.6.Ubuntu-18.04.aarch64-qnx.cuda-10.2.cudnn7.6.tar.gz
export TRT_RELEASE=`pwd`/TensorRT-7.2.1.6
export QNX_HOST=/<path-to-qnx-toolchain>/host/linux/x86_64
export QNX_TARGET=/<path-to-qnx-toolchain>/target/qnx7
Example: Windows on x86-64 with cuda-11.0
Download and extract the TensorRT 7.2.1 GA for Windows and CUDA 11.0 zip package and add msbuild to PATH
cd ~\Downloads
Expand-Archive .\TensorRT-7.2.1.6.Windows10.x86_64.cuda-11.0.cudnn8.0.zip
$Env:TRT_RELEASE = '$(Get-Location)\TensorRT-7.2.1.6'
$Env:PATH += 'C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\MSBuild\15.0\Bin'
- (Optional) JetPack SDK for Jetson builds
Using the JetPack SDK manager, download the host components. Steps:
i. Download and launch the SDK manager. Login with your developer account.
ii. Select the platform and target OS (example: Jetson AGX Xavier, Linux Jetpack 4.4), and click Continue.
iii. Under Download & Install Options change the download folder and select Download now, Install later. Agree to the license terms and click Continue.
iv. Move the extracted files into the $TRT_SOURCE/docker/jetpack_files folder.
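For example, if the SDK Manager download folder was set to ~/Downloads/nvidia/sdkm_downloads (a hypothetical location; substitute whatever folder you chose in step iii), the files could be copied over as follows:
# Copy the JetPack host components into the TensorRT OSS tree (source path is illustrative)
mkdir -p $TRT_SOURCE/docker/jetpack_files
cp ~/Downloads/nvidia/sdkm_downloads/* $TRT_SOURCE/docker/jetpack_files/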
Setting Up The Build Environment
For native builds, install the prerequisite System Packages.
Alternatively (recommended for non-Windows builds), install Docker and generate a build container as described below:
- Generate the TensorRT-OSS build container.
The TensorRT-OSS build container can be generated using the Dockerfiles and build script included with TensorRT-OSS. The build container is bundled with packages and environment required for building TensorRT OSS.
Example: Ubuntu 18.04 on x86-64 with cuda-11.1
./docker/build.sh --file docker/ubuntu.Dockerfile --tag tensorrt-ubuntu --os 18.04 --cuda 11.1
Example: Ubuntu 18.04 on PowerPC with cuda-11.0
./docker/build.sh --file docker/ubuntu-cross-ppc64le.Dockerfile --tag tensorrt-ubuntu-ppc --os 18.04 --cuda 11.0
Example: CentOS/RedHat 7 on x86-64 with cuda-11.0
./docker/build.sh --file docker/centos.Dockerfile --tag tensorrt-centos --os 7 --cuda 11.0
Example: Ubuntu 18.04 Cross-Compile for Jetson (arm64) with cuda-10.2 (JetPack)
./docker/build.sh --file docker/ubuntu-cross-aarch64.Dockerfile --tag tensorrt-cross-jetpack --os 18.04 --cuda 10.2
- Launch the TensorRT-OSS build container.
Example: Ubuntu 18.04 build container
./docker/launch.sh --tag tensorrt-ubuntu --gpus all --release $TRT_RELEASE --source $TRT_SOURCE
NOTE:
i. Use the tag corresponding to the build container you generated in the previous step.
ii. To run TensorRT/CUDA programs in the build container, install the NVIDIA Container Toolkit. Docker versions < 19.03 require nvidia-docker2 and the --runtime=nvidia flag for docker run commands. On versions >= 19.03, you need the nvidia-container-toolkit package and the --gpus all flag.
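As a rough sketch of that difference, the launch script corresponds to docker run invocations along the following lines (the image tag and mount points here are illustrative only, not the exact behavior of launch.sh):
# Docker >= 19.03 with nvidia-container-toolkit
docker run --gpus all -it -v $TRT_RELEASE:/tensorrt -v $TRT_SOURCE:/workspace tensorrt-ubuntu
# Docker < 19.03 with nvidia-docker2
docker run --runtime=nvidia -it -v $TRT_RELEASE:/tensorrt -v $TRT_SOURCE:/workspace tensorrt-ubuntu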
Building TensorRT-OSS
· Generate Makefiles or VS project (Windows) and build.
Example: Linux (x86-64) build with default cuda-11.1
cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out
make -j$(nproc)
Example: Native build on Jetson (arm64) with cuda-10.2
cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out -DTRT_PLATFORM_ID=aarch64 -DCUDA_VERSION=10.2
make -j$(nproc)
Example: Ubuntu 18.04 Cross-Compile for Jetson (arm64) with cuda-10.2 (JetPack)
cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out -DCMAKE_TOOLCHAIN_FILE=$TRT_SOURCE/cmake/toolchains/cmake_aarch64.toolchain -DCUDA_VERSION=10.2
make -j$(nproc)
Example: Cross-Compile for QNX with cuda-10.2
cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out -DCMAKE_TOOLCHAIN_FILE=$TRT_SOURCE/cmake/toolchains/cmake_qnx.toolchain -DCUDA_VERSION=10.2
make -j$(nproc)
Example: Windows (x86-64) build in Powershell
cd $Env:TRT_SOURCE
mkdir -p build ; cd build
cmake .. -DTRT_LIB_DIR=$Env:TRT_RELEASE\lib -DTRT_OUT_DIR='$(Get-Location)\out' -DCMAKE_TOOLCHAIN_FILE=..\cmake\toolchains\cmake_x64_win.toolchain
msbuild ALL_BUILD.vcxproj
NOTE:
The default CUDA version used by CMake is 11.1. To override this, for example to 10.2, append -DCUDA_VERSION=10.2 to the cmake command.
If samples fail to link on CentOS7, create this symbolic link: ln -s $TRT_OUT_DIR/libnvinfer_plugin.so $TRT_OUT_DIR/libnvinfer_plugin.so.7
· Required CMake build arguments are:
TRT_LIB_DIR: Path to the TensorRT installation directory containing libraries.
TRT_OUT_DIR: Output directory where generated build artifacts will be copied.
· Optional CMake build arguments (a combined example invocation is sketched after this list):
CMAKE_BUILD_TYPE: Specify whether the generated binaries are release or debug (contain debug symbols) builds. Values consist of [Release] | Debug
CUDA_VERSION: The version of CUDA to target, for example [11.1].
CUDNN_VERSION: The version of cuDNN to target, for example [8.0].
NVCR_SUFFIX: Optional nvcr/cuda image suffix. Set to "-rc" for CUDA11 RC builds until general availability. Blank by default.
PROTOBUF_VERSION: The version of Protobuf to use, for example [3.0.0]. Note: Changing this will not configure CMake to use a system version of Protobuf; it will configure CMake to download and try building that version.
CMAKE_TOOLCHAIN_FILE: The path to a toolchain file for cross compilation.
BUILD_PARSERS: Specify if the parsers should be built, for example [ON] | OFF. If turned OFF, CMake will try to find precompiled versions of the parser libraries to use in compiling samples. First in ${TRT_LIB_DIR}, then on the system. If the build type is Debug, then it will prefer debug builds of the libraries before release versions if available.
BUILD_PLUGINS: Specify if the plugins should be built, for example [ON] | OFF. If turned OFF, CMake will try to find a precompiled version of the plugin library to use in compiling samples. First in ${TRT_LIB_DIR}, then on the system. If the build type is Debug, then it will prefer debug builds of the libraries before release versions if available.
BUILD_SAMPLES: Specify if the samples should be built, for example [ON] | OFF.
CUB_VERSION: The version of CUB to use, for example [1.8.0].
GPU_ARCHS: GPU (SM) architectures to target. By default we generate CUDA code for all major SMs. Specific SM versions can be specified here as a quoted space-separated list to reduce compilation time and binary size. Table of compute capabilities of NVIDIA GPUs can be found here.
Examples:
NVIDIA A100: -DGPU_ARCHS="80"
Tesla T4, GeForce RTX 2080: -DGPU_ARCHS="75"
Titan V, Tesla V100: -DGPU_ARCHS="70"
Multiple SMs: -DGPU_ARCHS="80 75"
TRT_PLATFORM_ID: Bare-metal builds (unlike containerized cross-compilation) on non-Linux/x86 platforms must explicitly specify the target platform. Currently supported options: x86_64 (default), aarch64
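For illustration, a hypothetical Linux build combining several of the optional arguments above might look like the following (the chosen flag values are examples only, not recommended settings):
# Example cmake invocation mixing required and optional arguments
cd $TRT_SOURCE
mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_RELEASE/lib -DTRT_OUT_DIR=`pwd`/out \
    -DCMAKE_BUILD_TYPE=Debug -DCUDA_VERSION=11.0 -DCUDNN_VERSION=8.0 \
    -DGPU_ARCHS="75" -DBUILD_SAMPLES=ON
make -j$(nproc)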
(Optional) Install TensorRT python bindings
· The TensorRT python API bindings must be installed for running TensorRT python applications.
Example: install TensorRT wheel for python 3.6
pip3 install $TRT_RELEASE/python/tensorrt-7.2.1.6-cp36-none-linux_x86_64.whl
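To check that the bindings are importable afterwards, a quick smoke test such as the following can be run (assuming python3 is the same interpreter the wheel was installed into):
# Print the installed TensorRT version to confirm the bindings load
python3 -c "import tensorrt; print(tensorrt.__version__)"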
References
TensorRT Resources
TensorRT Homepage
TensorRT Developer Guide
TensorRT Sample Support Guide
TensorRT Discussion Forums
TensorRT Release Notes
Known Issues
TensorRT 7.2.1
None