Jetson TX1 TK1 - yszheda/wiki GitHub Wiki
- https://developer.nvidia.com/embedded/develop/hardware
- https://developer.nvidia.com/embedded/linux-tegra-archive
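- NVIDIA Jetson TK1 Learning and Development (7): An Illustrated Guide to Installing and Using OpenCV on Jetson TK1
- TX1 flashing tutorial (installing caffe, cuda/cudnn)
- Jetson TX1 Development Tutorial (1): Configuration and Flashing
- Jetson TX1/TX2 Configuration Tutorial: Copying the Offline Installation Packages
- The Most Complete Introduction to Using the Jetson TX1 (practical tips and extras; don't read unless you plan to buy)
- Jetson TX1/TX2 Configuration Tutorial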
Architecture
CUDA
Caffe
version
cat /etc/nv_tegra_release
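For scripts that need the L4T version as a single string, the first line of that file can be parsed. This is only a minimal sketch: the helper name l4t_version is hypothetical, and it assumes the first line has the form "# R24 (release), REVISION: 2.1, GCID: ...", which may differ between releases.

```shell
# Print the L4T version (for example "24.2.1") from an nv_tegra_release-style file.
# Assumption: the first line looks like "# R24 (release), REVISION: 2.1, GCID: ...".
# The optional file argument is a hypothetical convenience for testing.
l4t_version() {
    file="${1:-/etc/nv_tegra_release}"
    head -n 1 "$file" |
        sed -n 's/^# R\([0-9][0-9]*\) (release), REVISION: \([0-9.][0-9.]*\),.*/\1.\2/p'
}
```

Called with no argument, l4t_version reads /etc/nv_tegra_release; if the line does not match the assumed format, it prints nothing.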
Performance
TX1
sudo su
echo userspace > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo userspace > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo userspace > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo userspace > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq > /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_max_freq > /sys/devices/system/cpu/cpu1/cpufreq/scaling_min_freq
cat /sys/devices/system/cpu/cpu2/cpufreq/scaling_max_freq > /sys/devices/system/cpu/cpu2/cpufreq/scaling_min_freq
cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_max_freq > /sys/devices/system/cpu/cpu3/cpufreq/scaling_min_freq
echo 0 > /sys/devices/system/cpu/cpuquiet/tegra_cpuquiet/enable
for file in /sys/devices/system/cpu/cpu*/online; do
    if [ "$(cat "$file")" -eq 0 ]; then
        echo 1 > "$file"
    fi
done
echo runnable > /sys/devices/system/cpu/cpuquiet/current_governor
cat /sys/kernel/debug/clock/gpu_dvfs_t
cat /sys/kernel/debug/clock/dvfs_table
cat /sys/kernel/debug/clock/gbus/max > /sys/kernel/debug/clock/override.gbus/rate
echo 1 > /sys/kernel/debug/clock/override.gbus/state
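To confirm that the CPU settings above took effect, the cpufreq files can be read back. The following is a minimal sketch, not part of the original notes: the function name show_cpufreq and its optional root argument are hypothetical, and it only reads the same scaling_governor, scaling_min_freq and scaling_max_freq files that the commands above use.

```shell
# Print governor and min/max frequency for every CPU under a cpufreq sysfs root.
# The root argument is a hypothetical convenience; it defaults to the real sysfs path.
show_cpufreq() {
    root="${1:-/sys/devices/system/cpu}"
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        # Skip the unexpanded glob pattern when no CPU directory matches.
        [ -d "$dir" ] || continue
        cpu=$(basename "$(dirname "$dir")")
        printf '%s governor=%s min=%s max=%s\n' "$cpu" \
            "$(cat "$dir/scaling_governor")" \
            "$(cat "$dir/scaling_min_freq")" \
            "$(cat "$dir/scaling_max_freq")"
    done
}
```

After the TX1 commands have run, each line printed by show_cpufreq should report the userspace governor with min equal to max.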
TK1
# Maximizing CPU performance
echo 0 > /sys/devices/system/cpu/cpuquiet/tegra_cpuquiet/enable
echo 1 > /sys/devices/system/cpu/cpu0/online
echo 1 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu3/online
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# Controlling GPU performance
echo 852000000 > /sys/kernel/debug/clock/override.gbus/rate
echo 1 > /sys/kernel/debug/clock/override.gbus/state
Development
zero-copy memory
Jetson TK1 supports the complete CUDA Toolkit version 6.0. Tegra K1 supports Unified Memory, but in contrast to current desktop and server GPUs, the memory on Tegra is physically unified, with separate GPU and CPU caches. In practice this means you allocate memory on Tegra K1 with the cudaMallocManaged API, just as you do on Tesla and GeForce; the programming model is the same across all GPUs.
On Tegra, GPU and CPU allocate memory from the same hardware. The main difference is in sync and cache handling.
Sync:
- Unified: auto-sync via GPU driver
- Zero-copy: pinned memory, but access may be slow for some locations.
Cache:
- Unified: YES
- Zero-copy: NO
We recommend that Jetson users use unified memory; more information can be found here: http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-introduction
The 2014 article http://arrayfire.com/zero-copy-on-tegra-k1/ states that zero-copy is faster than cudaMalloc. That article is misleading: it generalizes from the zero-copy case, which is not accurate.
Zero copy is only faster in some cases where the access pattern does not benefit from caches.
Zero-copy memory on Tegra is uncached on both the CPU and the GPU, so every access by the CUDA kernel goes to DRAM. If the kernel repeatedly accesses the same memory location, cudaMalloc memory is therefore likely to be faster.
cudaHostRegister() is not supported on ARM platforms. This is because the caching attribute of an existing allocation can't be changed on the fly. If required, please use cudaHostAlloc() with the flag cudaHostAllocMapped to allocate device-mapped host-accessible memory.
Unified Memory
From https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-requirements
K.1.1. System Requirements
Unified Memory has two basic requirements:
- a GPU with SM architecture 3.0 or higher (Kepler class or newer)
- a 64-bit host application and non-embedded operating system (Linux, Windows, macOS)

- Data transfer between CPU and GPU on Jetson TK1?
- GPU Direct RDMA for Jetson TK1
- Jetson Tegra TX1 Shared Memory
OpenCV
Troubleshooting
CUDA driver version is insufficient for CUDA runtime version
monitor
sudo ~/tegrastats
sudo ~/jetson_clocks.sh --show
- nvidia-smi command not found... I'm obviously missing something
- Tx2 with jetpack 3.1 , GPU utilization issue for video encoding MSENC/NVENC
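Since nvidia-smi is not available on these boards, GPU load is usually read from the GR3D field that tegrastats prints. The filter below is a rough sketch under an assumption: each tegrastats line contains a field such as "GR3D 35%@998" or "GR3D_FREQ 35%@998", and the exact format varies between L4T releases. The name gr3d_percent is hypothetical.

```shell
# Read tegrastats-style lines on stdin and print the GR3D (GPU) load percentage of each.
# Assumption: each line contains a field such as "GR3D 35%@998" or "GR3D_FREQ 35%@998".
# Lines without such a field are silently dropped.
gr3d_percent() {
    sed -n 's/.*GR3D[_A-Z]* \([0-9][0-9]*\)%.*/\1/p'
}
```

A typical use would be to pipe the monitor into it, for example sudo ~/tegrastats | gr3d_percent, which prints one number per sample.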
status == CUDNN_STATUS_SUCCESS (4 vs. 0) CUDNN_STATUS_INTERNAL_ERROR
Must be run as root.
Installing libopencv4tegra on TK1
Depends: libavcodec54 (>= 6:9.1-1) but it is not installable or
libavcodec-extra-54 (>= 6:9.16) but it is not installable
Depends: libavformat54 (>= 6:9.1-1) but it is not installable
Depends: libavutil52 (>= 6:9.1-1) but it is not installable
Depends: libswscale2 (>= 6:9.1-1) but it is not installable
E: Unable to correct problems, you have held broken packages.
sudo apt-add-repository universe
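After the repository is enabled, the package index normally has to be refreshed with sudo apt-get update before the libopencv4tegra install is retried. To check whether universe is already enabled, something like the following could be used. It is a sketch only: the helper name has_universe is hypothetical, and it inspects a single sources file, /etc/apt/sources.list by default, ignoring any files under sources.list.d.

```shell
# Succeed (exit status 0) if an uncommented "deb" line in the given sources file
# lists the "universe" component. The file argument defaults to /etc/apt/sources.list.
has_universe() {
    file="${1:-/etc/apt/sources.list}"
    grep -Eq '^[[:space:]]*deb[[:space:]].*[[:space:]]universe([[:space:]]|$)' "$file"
}
```

For example, has_universe || sudo apt-add-repository universe adds the repository only when it is missing.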