Network performance measures depend on the problem. If the network has to perform a classification task, it is common to report the error as the percentage of incorrect classifications. In that case quite high errors in the individual output activations can be tolerated, as long as each pattern is still assigned to the right class. If the network has to match a smooth function, it may be most sensible to calculate the RMS error over all output units.
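The two measures mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this example, and the classification rule (the most active output unit wins) is one common choice, not necessarily the one your task needs.

```python
import math

def classification_error_rate(outputs, targets):
    """Fraction of patterns whose most active output unit differs from the target's.

    Assumption: winner-takes-all decoding, i.e. the class is the index of the
    largest activation in both the output row and the target row.
    """
    wrong = 0
    for out, tgt in zip(outputs, targets):
        predicted = max(range(len(out)), key=lambda i: out[i])
        expected = max(range(len(tgt)), key=lambda i: tgt[i])
        if predicted != expected:
            wrong += 1
    return wrong / len(outputs)

def rms_error(outputs, targets):
    """Root-mean-square error over all output units and all patterns."""
    total = 0.0
    count = 0
    for out, tgt in zip(outputs, targets):
        for o, t in zip(out, tgt):
            total += (o - t) ** 2
            count += 1
    return math.sqrt(total / count)

# Made-up activations for three test patterns with two output units each.
outputs = [[0.6, 0.4], [0.3, 0.7], [0.55, 0.45]]
targets = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

print("classification error rate:", classification_error_rate(outputs, targets))
print("RMS error:", rms_error(outputs, targets))
```

Note that the first pattern counts as correctly classified even though its winning unit only reaches 0.6 against a target of 1.0, which is why large activation errors are tolerable for classification but show up clearly in the RMS error.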
The most sensible way to proceed is to save the output activations together with the target values for the test data and to write a little program that does whatever testing is required. The saved files are just the ticket for this. Note that the output patterns are always saved: 'include output patterns' actually means 'include target (!) patterns'.
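As a starting point for such a little program, here is a minimal Python sketch that reads target and output rows back in. The file layout it expects is purely hypothetical (comment lines starting with '#', then alternating rows of target values and output activations) and is not taken from the simulator's documentation; check a real saved file and adapt the parser to what you find there. The function name is likewise invented for this example.

```python
def read_target_output_pairs(lines):
    """Parse alternating target/output rows from an iterable of text lines.

    HYPOTHETICAL layout (an assumption, not a documented format):
    blank lines and lines starting with '#' are skipped; the remaining
    lines alternate between a row of target values and a row of output
    activations, with values separated by whitespace.
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rows.append([float(field) for field in line.split()])
    if len(rows) % 2 != 0:
        raise ValueError("expected an even number of data rows (target, output)")
    targets = rows[0::2]   # rows 0, 2, 4, ... are targets
    outputs = rows[1::2]   # rows 1, 3, 5, ... are output activations
    return targets, outputs

# Made-up sample standing in for a saved file.
sample = """\
# pattern 1
1 0
0.6 0.4
# pattern 2
0 1
0.3 0.7
""".splitlines()

targets, outputs = read_target_output_pairs(sample)
print(len(targets), "patterns read")
```

With a real file, an open file object can be passed in place of `sample`, since the function only needs something that yields lines. The two lists it returns can then be handed to whatever error measure the task calls for.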