Frequently Asked Questions

Do I need to attend the ICDAR 2015 conference?

No, you don't. Attendance at the ICDAR conference is not mandatory to participate in or to win the competition. However, it is highly encouraged, since it gathers many researchers from around the world, and some of them will present work related to keyword spotting.

Do I have to write a paper explaining my method?

No, but you must include a short description of it in your submission. Based on that description, we will include an explanation of your method in the competition report, which will be included in the ICDAR 2015 proceedings.

Do I need to participate in all tasks/assignments?

No. You can submit your solution to any of the four assignments. Of course, the more assignments you participate in, the better!

Can I win in a track if I only complete one assignment?

Yes, it is just more difficult. The score that you get in each track is mainly based on your best performance across the two assignments in the track.

Because we wanted to encourage researchers from different backgrounds to participate, and to encourage methods able to deal with different scenarios, we give additional points based on your performance in the second assignment, the one where your solution performs worse. See the evaluation section for additional details.

Is there any limit regarding the number of submissions?

No. The only limit is that, once you make a submission, you must wait 30 minutes before making the next one. This is to prevent high traffic and load on our evaluation server.

Your final score will be based on your best submission in each assignment.

Is there any limit regarding the size of my solution files?

Yes. In order to avoid high incoming traffic to the evaluation server, we have limited the solution files to 16MB.

This limit should be more than enough for your solution files with the current test data. However, if you think that this limit is penalizing you somehow, you can compress your files using Gzip and append the ".gz" extension to the filename.
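
As an illustration, the compression step could be scripted as in the following minimal sketch. The function names are hypothetical, and the sketch assumes that "16MB" means 16 * 1024 * 1024 bytes, which is not stated above.

```python
import gzip
import shutil
from pathlib import Path

# Assumption: "16MB" is read as 16 MiB; the exact byte count is not specified.
MAX_BYTES = 16 * 1024 * 1024


def compress_solution(path):
    """Gzip the file at `path` into a sibling file named `<path>.gz`."""
    src = Path(path)
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def fits_limit(path, max_bytes=MAX_BYTES):
    """Return True if the file at `path` is no larger than `max_bytes`."""
    return Path(path).stat().st_size <= max_bytes
```

For example, compress_solution("solution.txt") would write "solution.txt.gz" next to the original file, and fits_limit can then be used to check the compressed file before uploading it.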

What must be the format of my solution file?

Please read the evaluation section carefully first.

Your solution file must be a plain text file containing multiple lines, one for each of your matches.

Each of these lines must have at least 6 fields:

  1. Document ID (for segmentation-free assignments) or the segmented word ID (for segmentation-based assignments).
  2. Query image ID (Query-by-Example) or the query string (Query-by-String).
  3. Four fields encoding the bounding box of the match in the document image, given as Left-X, Top-Y, Width and Height. Note that for the segmentation-based assignments, (X, Y) is always (0, 0) and the width and height are those of the segmented word image.

Additional fields are ignored: for instance, you may explicitly include your confidence in the match, but this will not be used by the evaluation software.

It is very important that the lines are sorted properly: by decreasing confidence (the most confident match should be placed first).
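
The layout described above could be produced as in the following minimal sketch. It assumes that fields are separated by a single space, which is not stated here (please check the evaluation section for the exact separator). The Match type, the function name and the IDs in the example are hypothetical.

```python
from typing import NamedTuple


class Match(NamedTuple):
    target_id: str     # document ID, or segmented word ID
    query: str         # query image ID, or query string
    x: int             # Left-X of the bounding box
    y: int             # Top-Y of the bounding box
    width: int
    height: int
    confidence: float  # only used for sorting; ignored by the evaluation


def format_solution(matches):
    """Return one line per match, most confident match first."""
    ordered = sorted(matches, key=lambda m: m.confidence, reverse=True)
    # Assumption: space-separated fields. The trailing confidence is an
    # optional seventh field, which the evaluation software ignores.
    return [
        f"{m.target_id} {m.query} {m.x} {m.y} {m.width} {m.height} {m.confidence:.4f}"
        for m in ordered
    ]


# Hypothetical matches for a segmentation-free assignment.
example = [
    Match("doc01", "q07", 120, 45, 80, 30, 0.42),
    Match("doc02", "q07", 10, 200, 95, 28, 0.91),
]
print("\n".join(format_solution(example)))
```

Writing the confidence as the last field is optional, but it makes it easy to verify the ordering of the file later.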

Note that we use the mean Average Precision (mAP) as our evaluation metric. It is computed from a ranked list of results for each query. So, you may also sort the file locally for each query, and this will give you exactly the same mAP as if you sorted your file globally. However, we suggest using a global order, since other metrics (like AP) are sensitive to it and a local order may hinder later analysis.
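
The equivalence between the two orderings can be checked with the following simplified sketch. It is not the official evaluation software: here the relevance of each match is given directly as a flag, whereas the real evaluation derives it from the ground truth, and all names and data are hypothetical.

```python
from collections import defaultdict


def average_precision(relevance, n_relevant):
    """AP of one ranked list; `relevance` holds booleans in rank order."""
    if n_relevant == 0:
        return 0.0
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / n_relevant


def mean_average_precision(rows, n_relevant):
    """mAP over queries.

    rows: (query, is_relevant) pairs in file order.
    n_relevant: dict mapping each query to its number of relevant items.
    """
    per_query = defaultdict(list)
    for query, is_relevant in rows:
        per_query[query].append(is_relevant)
    aps = [average_precision(per_query[q], n) for q, n in n_relevant.items()]
    return sum(aps) / len(aps)


# Hypothetical results: (query, confidence, is_relevant).
results = [
    ("a", 0.9, True), ("b", 0.8, False), ("a", 0.7, False),
    ("b", 0.6, True), ("a", 0.5, True),
]
n_relevant = {"a": 2, "b": 1}

# Global order: all matches sorted by decreasing confidence.
global_order = sorted(results, key=lambda r: -r[1])
# Local order: grouped by query, then sorted by decreasing confidence.
local_order = sorted(results, key=lambda r: (r[0], -r[1]))

map_global = mean_average_precision([(q, rel) for q, _, rel in global_order], n_relevant)
map_local = mean_average_precision([(q, rel) for q, _, rel in local_order], n_relevant)
print(map_global == map_local)
```

Both orderings yield the same per-query ranked lists, so the mAP is identical. A single AP computed over the whole file would, in contrast, depend on how the matches of different queries are interleaved.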

What should we do with hyphenated words?

We consider a hyphenated word as two separate "keywords", so you should not treat them specially. For instance, if the word "action" is hyphenated into "ac-" and "-tion", you should spot the "ac-" region only for the query "AC-", the "-tion" region only for the query "-TION", and neither of them should be retrieved for the complete keyword "ACTION".

We decided to handle hyphenation this way both to keep the ground truth simple and because dealing with the appropriate context would be extremely difficult in the segmentation-based scenario, where no context information is available.

Where is the template for the description of our submission?

There is no template for the description. The only restriction on the description text is that it has to be between 1000 and 5000 characters long.

The submission form has a mandatory text area field that you will have to fill in with plain text when you submit your solution. If you need to include any picture or attach any document, please include its URL in the text. If you need to write a mathematical formula, we'd appreciate it if you wrote it using LaTeX syntax.

Please note that we have to summarize all the systems participating in the contest in the paper that we, the organizers, must submit to the ICDAR Competition Chairs. Thus, we cannot afford highly detailed explanations of each individual solution.

Will you tell us the score of the baseline algorithm for each track?

Yes. Once you make a valid submission to a certain assignment, we will tell you what the baseline mAP is and whether your mAP is higher or lower than it.

Note that we won't release the ground truth until the end of the competition, so you won't be able to compute any mAP score by yourself until then.

Do you have any other questions?

Please mail them to joapuipe_AT_prhlt_DOT_upv_DOT_es. We will answer them and post them here for others.