Splitting the dataset into training and test sets #4

Open
FLyingLSJ opened this issue Jun 19, 2020 · 3 comments

Comments

@FLyingLSJ

Quick question: is there any code for splitting this dataset into training and test sets?

@a769302434

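import os
import random

trainval_percent = 0.8  # share of all samples that goes to train + val
train_percent = 0.8     # share of trainval that goes to train
xmlfilepath = 'Annotations'
txtsavepath = os.path.join('ImageSets', 'Main')
os.makedirs(txtsavepath, exist_ok=True)

# Only annotation files count; sort so the listing order is stable
total_xml = sorted(f for f in os.listdir(xmlfilepath) if f.endswith('.xml'))

num = len(total_xml)
indices = range(num)  # renamed from `list` to avoid shadowing the built-in
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = set(random.sample(indices, tv))
train = set(random.sample(sorted(trainval), tr))

ftrainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
ftest = open(os.path.join(txtsavepath, 'test.txt'), 'w')
ftrain = open(os.path.join(txtsavepath, 'train.txt'), 'w')
fval = open(os.path.join(txtsavepath, 'val.txt'), 'w')

for i in indices:
    # Write the file name without its .xml extension
    name = os.path.splitext(total_xml[i])[0] + '\n'
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()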
You could try this.
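To make the two ratios in the script concrete: `train_percent` is applied to the trainval subset, not to the whole dataset, so with both values at 0.8 the split works out to roughly 64% train, 16% val, and 20% test. A minimal sketch of that arithmetic follows; `split_sizes` is a hypothetical helper name, not something from the original script.

```python
def split_sizes(num, trainval_percent, train_percent):
    """Return (train, val, test) counts using the same truncation as the script."""
    tv = int(num * trainval_percent)  # samples in train + val
    tr = int(tv * train_percent)      # samples in train only
    return tr, tv - tr, num - tv


# Assumed example: 100 annotation files with both ratios at 0.8
train_n, val_n, test_n = split_sizes(100, 0.8, 0.8)
print(train_n, val_n, test_n)
```

Because `int()` truncates, any rounding remainder ends up in val or test rather than in train.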

@FLyingLSJ
Author

FLyingLSJ commented Jul 2, 2020

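How to split the data probably depends on the specific dataset; the split should keep the class distribution balanced, right? The Zhihu article https://zhuanlan.zhihu.com/p/129842491 also covers quite a lot, but I did not see the concrete sample split described there.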
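One way to keep the distribution balanced is a stratified split: group the samples by class, then split each group with the same ratio. The sketch below is only an illustration under a strong assumption, namely that each image can be given a single class label (for detection data with several object classes per image, that label would have to be chosen somehow, for example the dominant class). `stratified_split` and the toy data are hypothetical and do not come from this repository.

```python
import random
from collections import defaultdict


def stratified_split(label_by_name, test_fraction=0.2, seed=0):
    """Split names into (train, test) so each label keeps a similar share."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    groups = defaultdict(list)
    for name, label in sorted(label_by_name.items()):
        groups[label].append(name)

    train, test = [], []
    for label in sorted(groups):
        names = groups[label]
        rng.shuffle(names)
        n_test = round(len(names) * test_fraction)  # per-class test count
        test.extend(names[:n_test])
        train.extend(names[n_test:])
    return train, test


# Hypothetical toy data: 10 images of class "a", 5 images of class "b"
labels = {f"a_{i}": "a" for i in range(10)}
labels.update({f"b_{i}": "b" for i in range(5)})
train, test = stratified_split(labels, test_fraction=0.2, seed=0)
```

The resulting name lists could then be written to `train.txt` and `test.txt` in the same one-name-per-line format as the script earlier in the thread.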

@a769302434

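Good point. By the way, have you managed to get this code running? I have only just started using the mmdetection framework and do not know it well, so I would like to ask you for some advice.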
