Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TestFilterBuilder builds invalid XML #491

Open
jnm2 opened this issue Oct 24, 2018 · 13 comments
Open

TestFilterBuilder builds invalid XML #491

jnm2 opened this issue Oct 24, 2018 · 13 comments

Comments

@jnm2
Copy link
Collaborator

jnm2 commented Oct 24, 2018

(Updated)

This implementation mishandles surrogate pairs. Control characters other than tab, carriage return and line feed cannot be represented in XML and should throw, IIRC. (See nunit/nunit3-vs-adapter#484.)

private static string XmlEscape(string text)
{
return text
.Replace("&", "&")
.Replace("\"", """)
.Replace("<", "&lt;")
.Replace(">", "&gt;")
.Replace("'", "&apos;");
}

GetFilter() should be using XmlWriter.Create to a StringWriter. This will not only get us out of the responsibility of implementing the XML spec, it will probably also result in some significant performance gains. (Judging by the chained string.Replace calls.)

(Test names containing control characters cannot be represented in XML without first encoding using base64 or something custom. This will need to be dealt separately with in the framework.)

@nunit/engine-team Does this seem right to you? Can we let @MatthewBeardmore get started on this if he is interested?

@ChrisMaddock
Copy link
Member

Yep, looks good to me!

@jnm2
Copy link
Collaborator Author

jnm2 commented Oct 24, 2018

Wait a second, I remembered wrong. Control chars are totally banned in XML and cannot even be escaped.

This means that test names with control characters cannot be represented in XML and would have to be specially encoded. For example, base64 like we do for binary content.

@jnm2
Copy link
Collaborator Author

jnm2 commented Oct 24, 2018

I'm slowly catching up to where you all are. This issue might depend on nunit/nunit#3063 (comment).

@OsirisTerje
Copy link
Member

Base64 sounds fine to me

@jnm2
Copy link
Collaborator Author

jnm2 commented Oct 24, 2018

@OsirisTerje I'm currently thinking we wouldn't even need additional encoding such as base64 if the framework and adapter ensure that there's no such thing as a test name with control characters in it: nunit/nunit#3063 (comment)

@OsirisTerje
Copy link
Member

I don't think this is up to us to decide - just because we use XML as an intermediate information carrier. The adapter has to accept whatever test name that comes down, and the source based discovery mechanism will use these names for the list that will be passed down to the adapter. They will be part of the FQN - if we can't persuade MS to also ignore these characters, which I doubt. So the only option I can see is that we need to encode them before we use the XML to send them between adapter and engine, and further.

@jnm2
Copy link
Collaborator Author

jnm2 commented Oct 30, 2018

@OsirisTerje I hear what you're saying. We should probably raise this question about parameterized FQNs. However, if NUnit Framework decides that all test names will be prefixed with T- by default, that's a decision I do think is up to the framework. Likewise, if the framework replaces every A with a B in the test names it generates, that's also fine (though nonsensical) so long as the framework is self-consistent.

So why not replace control characters in the name and fullname attributes, and send a base64-encoded fqn attribute to make it possible for the adapter to losslessly interface with VSTest and source-based discovery?

@ChrisMaddock ChrisMaddock added this to the 4.0 milestone Jan 30, 2021
@ChrisMaddock
Copy link
Member

We should address this in v4. I think the easiest way to do this would be to ensure all XML is constructed via built in .NET XML writing/reading classes, rather than things like string replace. It may mean that the engine no longer supports tests named with certain characters.

The framework may need a corresponding change to rename tests which would currently by default be given invalid names.

@CharliePoole
Copy link
Contributor

Agree with re-implementing this but sure about the breaking change.

I'm OK with breaking user-created naming overrides - like using SetName on a test case.

I wouldn't be OK with limiting the use of any naming, which is legal in C# or some other language we support.

@ChrisMaddock
Copy link
Member

There's a lot of moving parts in this issue, but I think Joseph has summed it up quite well here: nunit/nunit#3063 (comment)

Currently, for the example shown at nunit/nunit3-vs-adapter#484, the framework by default gives that a test name which can't legally be represented by XML. This seems to cause unpredictability when the framework/engine/runners all rely on xml to encode their communications.

IMO, the framework should be changed to not create xml-invalid names by default, and prevent users overriding this with SetName. For the engine's part, it should only:

  1. Expect that all messages received from framework(s) are valid XML.
  2. Prevent runners creating content for the framework which cannot be encoding in XML. The only example I can think of here is a runner attempting to create a test filter which can't be encoded in XML, as happened in this situation.

If the framework doesn't create test names as per item 1, then I can't think of any valid need for item 2 - the engine just needs to handle this elegantly.

@CharliePoole
Copy link
Contributor

@jnm2 Do you think that some subset of this issue should be implemented as a 3.16 fix, avoiding the breaking change?

@jnm2
Copy link
Collaborator Author

jnm2 commented Oct 24, 2022

@CharliePoole What would that look like? I reread but I'm not seeing a subset stand out to me as less breaking.

@CharliePoole
Copy link
Contributor

@jnm2 I may be missing something but here's what I was thinking...

There is no valid method name containing control characters. Therefore, no default test name can contain control characters
and such a test name can only be created by the user using SetName. However, such a test name serves no useful purpose
if we can't use it for filtering so why not just eliminate such names by disallowing control characters in SetName?

This might be "breaking" in the sense that you would get an error message rather than just fail to select the test, but so what?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants