OpenCV Cuda "invalid device function" on first cuda call



By : Marilyn Gutierrez
Date : November 20 2020, 03:01 PM
Thanks to @RobertCrovella for providing the correct answer. The problem was solved by simply adding 6.1 to the CUDA_ARCH_BIN list in CMake. I ended up using CUDA_ARCH_BIN = 5.0, 5.2, 6.0, 6.1 (since I'm only interested in Maxwell and Pascal) and left CUDA_GENERATION empty. If you select something for CUDA_GENERATION, it automatically fills in CUDA_ARCH_BIN for you, and in my case it added more architectures than I wanted.
Side note: I noticed that the more architectures you add to CUDA_ARCH_BIN, the larger the OpenCV DLLs become, which supports exactly what Robert was saying in his comments. It appears that for every architecture in the list, code specific to that architecture is added to the DLL. If an architecture is not covered by the list (and no PTX is included for JIT compilation), the code will not run on it.
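To make the link between the architecture list and the binary size concrete, here is a minimal C++ sketch that expands a list such as the one above into one nvcc -gencode clause per entry. The function name gencodeFlags is hypothetical, and this is not OpenCV's actual CMake logic; it only illustrates that each listed architecture asks nvcc for one more copy of the device code.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not part of OpenCV or CUDA): turns a list such as
// "5.0, 5.2, 6.0, 6.1" into one -gencode clause per architecture.
std::vector<std::string> gencodeFlags(const std::string& archBin) {
    std::vector<std::string> flags;
    std::string digits;  // digits of the entry being read, e.g. "61" for "6.1"

    // Emit a flag for the entry collected so far, if there is one.
    auto flush = [&]() {
        if (!digits.empty()) {
            flags.push_back("-gencode arch=compute_" + digits + ",code=sm_" + digits);
            digits.clear();
        }
    };

    for (char c : archBin) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            digits += c;
        } else if (c != '.') {
            flush();  // any separator (space, comma, semicolon) ends an entry
        }
    }
    flush();
    return flags;
}
```

Calling gencodeFlags("5.0, 5.2, 6.0, 6.1") yields four clauses, the last being -gencode arch=compute_61,code=sm_61. Each clause embeds one more machine-code image, which is consistent with the DLLs growing as the list gets longer.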
CUDA can no longer copy data from device to host after a "bad" call to a function



By : Chirag Kalra
Date : March 29 2020, 07:55 AM
Your problem is that x_host and y_host are pointers to host memory. The __global__ add function expects pointers to device memory. As you have constructed your code, add wrongly interprets x_host and y_host as device memory pointers.
As noticed by Farzad, you could have spotted the mistake yourself with proper CUDA error checking, in the sense of "What is the canonical way to check for errors using the CUDA runtime API?".
code :
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Wrap every CUDA runtime call so that failures are reported with file and line.
#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }
inline void gpuAssert(cudaError_t code, const char *file, int line, bool abort=true)
{
    if (code != cudaSuccess)
    {
        fprintf(stderr,"GPUassert: %s %s %d\n", cudaGetErrorString(code), file, line);
        if (abort) exit(code);
    }
}

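// Despite its name, this kernel computes a - b, matching the message printed in main.
// All three pointers must refer to device memory.
__global__ void add(int *a, int *b, int *c)
{
    *c = *a - *b;
}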

int main(void)
{
    int* x_host = (int*)malloc(sizeof(int));
    int* y_host = (int*)malloc(sizeof(int));

    *x_host = 8;
    *y_host = 4;

    int* tempGPU;   gpuErrchk(cudaMalloc((void**)&tempGPU,sizeof(int)));
    int* x_dev;     gpuErrchk(cudaMalloc((void**)&x_dev,  sizeof(int)));
    int* y_dev;     gpuErrchk(cudaMalloc((void**)&y_dev,  sizeof(int)));

    gpuErrchk(cudaMemcpy(x_dev, x_host, sizeof(int), cudaMemcpyHostToDevice));
    gpuErrchk(cudaMemcpy(y_dev, y_host, sizeof(int), cudaMemcpyHostToDevice));

    int result; 

    add<<<1,1>>> (x_dev, y_dev, tempGPU);
    gpuErrchk(cudaPeekAtLastError());
    gpuErrchk(cudaDeviceSynchronize());

    gpuErrchk(cudaMemcpy(&result, tempGPU, sizeof(int), cudaMemcpyDeviceToHost));

    printf("\n x_host - y_host = %d\n", result);

    gpuErrchk(cudaFree(x_dev));
    gpuErrchk(cudaFree(y_dev));
    gpuErrchk(cudaFree(tempGPU));

    // Release the host allocations as well.
    free(x_host);
    free(y_host);

    getchar();

    return 0;
}
Use cmake to configure cuda project for vs2013 and get "invalid device function" error



By : Gaël Bréard
Date : March 29 2020, 07:55 AM
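I use the CMake GUI tool to configure my CUDA project in VS2013, and the CMakeLists.txt sets the nvcc flags as in the first line below. This is a problem: that line generates machine code only for compute capability 2.0, with no PTX for newer GPUs to fall back on. The second line shows the same flags targeting compute capability 5.2 instead.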
code :
set(CUDA_NVCC_FLAGS -gencode arch=compute_20,code=sm_20;-G;-g)
set(CUDA_NVCC_FLAGS -gencode arch=compute_52,code=sm_52;-G;-g)
CAFFE: Cuda Error "(8 vs. 0) invalid device function" when using GPU (GeForce GTX 970)?



By : Руслан Мицкевич
Date : March 29 2020, 07:55 AM
Whenever the CUDA runtime API returns "invalid device function", it means you are using code that wasn't built for the architecture you are trying to run it on (and that doesn't have a JIT path).
You probably need to check your Caffe Makefile.config to make sure it is set for the correct architecture, by making sure that CUDA_ARCH includes -gencode arch=compute_52,code=compute_52.
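The rule described above can be sketched as a small host-only C++ model. The types Arch and FatBinary and the function canRunOn are hypothetical names introduced for illustration, not CUDA API types. The sketch assumes the usual compatibility rules (machine code for sm_XY runs on devices with the same major version and an equal or higher minor version; PTX for compute_XY can be JIT-compiled on any device of equal or higher compute capability) and ignores special cases such as architecture-specific targets.

```cpp
#include <vector>

// Compute capability split into its two parts, e.g. {5, 2} for sm_52.
struct Arch {
    int ccMajor;
    int ccMinor;
};

// Hypothetical description of what a binary embeds; not a CUDA API type.
struct FatBinary {
    std::vector<Arch> sass;  // machine-code targets (code=sm_XY)
    std::vector<Arch> ptx;   // PTX targets (code=compute_XY), JIT-compiled at load time
};

// Machine code runs on the same major version with an equal or higher minor version.
bool sassRunsOn(const Arch& built, const Arch& device) {
    return built.ccMajor == device.ccMajor && built.ccMinor <= device.ccMinor;
}

// PTX can be JIT-compiled for any device of equal or higher compute capability.
bool ptxRunsOn(const Arch& built, const Arch& device) {
    return built.ccMajor < device.ccMajor ||
           (built.ccMajor == device.ccMajor && built.ccMinor <= device.ccMinor);
}

// False models the "invalid device function" case: no usable image and no JIT path.
bool canRunOn(const FatBinary& bin, const Arch& device) {
    for (const Arch& a : bin.sass) {
        if (sassRunsOn(a, device)) return true;
    }
    for (const Arch& a : bin.ptx) {
        if (ptxRunsOn(a, device)) return true;
    }
    return false;
}
```

In this model, a binary holding only sm_20 machine code fails on a compute capability 5.2 device, while a binary that also embeds compute_52 PTX (as in the CUDA_ARCH setting above) succeeds through the JIT path.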
CMake + CUDA "invalid device function" even with correct SM version



By : user3185767
Date : March 29 2020, 07:55 AM
Ultimately, as expected, this was due to a build system setup problem.
TLDR version:
code :
cuda_select_nvcc_arch_flags(ARCH_FLAGS Auto)
list(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})

add_library(cuda-interop SHARED [c++ only code])
file(GLOB cuda_SOURCES "modules/cudainterop/cuda/*.cu")
# the library that only has the cuda code
add_library(cuda-interop-cuda SHARED ${cuda_SOURCES})
set_target_properties(cuda-interop-cuda PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
set_target_properties(cuda-interop-cuda PROPERTIES POSITION_INDEPENDENT_CODE ON)
target_link_libraries(cuda-interop PRIVATE cuda-interop-cuda)
cuda file error "Invalid device function"



By : Chearful
Date : March 29 2020, 07:55 AM
The GTX 295 has compute capability 1.3, I believe. It may be worth checking your solution's compiler settings to see whether you are compiling with something like compute_20,sm_20. If so, try changing these values to e.g. compute_10,sm_10, rebuild, and see whether it helps. See here for details on setting these values.