This is to deal with a special issue in the video descriptor xml output.
The value of the transcript attribute of the video tag is a serialized
json string. This can have comparison problems in python3 where the order
of the dictionary output is not gauranteed to be the same. So the strings
don't match equally. We convert the parsed json instead so that the
comparison can be correct.