Skip to content

Latest commit

 

History

History
62 lines (43 loc) · 3.02 KB

imagenet-tagging.md

File metadata and controls

62 lines (43 loc) · 3.02 KB

Back | Next | Contents
Image Classification

Multi-Label Classification for Image Tagging

Multi-label classification models are able to recognize multiple object classes simultaneously for performing tasks like image tagging. The multi-label DNNs are almost identical in topology to ordinary single-class models, except they use a sigmoid activation layer as opposed to softmax. There's a pre-trained resnet18-tagging-voc multi-label model available that was trained on the Pascal VOC dataset:

To enable image tagging, you'll want to run imagenet/imagenet.py with --topK=0 and a --threshold of your choosing:

# C++
$ imagenet --model=resnet18-tagging-voc --topK=0 --threshold=0.25 "images/object_*.jpg" images/test/tagging_%i.jpg

# Python
$ imagenet.py --model=resnet18-tagging-voc --topK=0 --threshold=0.25 "images/object_*.jpg" images/test/tagging_%i.jpg

Using --topK=0 means that all the classes with a confidence score exceeding the threshold will be returned by the classifier.

Retrieving Multiple Image Tags from Code

The imageNet.Classify() function will return (classID, confidence) tuples when the topK argument is specified. See imagenet.cpp or imagenet.py for code examples of using multiple classification results:

C++

imageNet::Classifications classifications;	// std::vector<std::pair<uint32_t, float>>  (classID, confidence)

if( net->Classify(image, input->GetWidth(), input->GetHeight(), classifications, topK) < 0 )
	continue;

for( uint32_t n=0; n < classifications.size(); n++ )
{
	const uint32_t classID = classifications[n].first;
	const char* classLabel = net->GetClassLabel(classID);
	const float confidence = classifications[n].second * 100.0f;
	
	printf("imagenet:  %2.5f%% class #%i (%s)\n", confidence, classID, classLabel);	
}

Python

predictions = net.Classify(img, topK=args.topK)

for n, (classID, confidence) in enumerate(predictions):
   classLabel = net.GetClassLabel(classID)
   confidence *= 100.0
   print(f"imagenet:  {confidence:05.2f}% class #{classID} ({classLabel})")

Note that topK can also be used in single-class classification to get the top N results, although those models weren't trained for image tagging.

Next | Detecting Objects from Images
Back | Running the Live Camera Recognition Demo

© 2016-2023 NVIDIA | Table of Contents