[haiku-bugs] Re: [Haiku] #13002: Parse URLs Without Requiring Group-Capable Regex
- From: "apl-haiku" <trac@xxxxxxxxxxxx>
- Date: Mon, 17 Oct 2016 10:28:14 -0000
#13002: Parse URLs Without Requiring Group-Capable Regex
----------------------------------+--------------------------------
 Reporter: apl-haiku              | Owner: nobody
 Type: enhancement                | Status: new
 Priority: normal                 | Milestone: Unscheduled
 Component: Network & Internet    | Version: R1/Development
 Resolution:                      | Keywords: URL BUrl regex Mac
 Blocked By:                      | Blocking:
 Has a Patch: 1                   | Platform: All
----------------------------------+--------------------------------
Comment (by apl-haiku):
1) Parsing URL is very tricky. The regexp as given by the RFC is
the reference and we should try to stick with it.
I agree that parsing URLs is not easy. Although this change does not use a
regex engine, I have used the regex from the RFC as a guide in writing it.
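To illustrate what "using the regex as a guide" can look like, here is a
minimal sketch. It is not the code in the patch. It walks the string in the
same order as the groups of the generic pattern in RFC 3986, Appendix B.
The names `UrlParts` and `SplitUrl` are hypothetical and are not part of
BUrl, and the sketch only splits the five top-level components; it does no
validation and does not break the authority into user, host and port.

```cpp
#include <string>

// Hypothetical holder for the five top-level components (not Haiku's BUrl).
struct UrlParts {
	std::string scheme;
	std::string authority;
	std::string path;
	std::string query;
	std::string fragment;
};

// Splits a URL following the group order of the RFC 3986 Appendix B pattern:
//   ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
UrlParts SplitUrl(const std::string& url)
{
	const std::string::size_type npos = std::string::npos;
	UrlParts parts;
	std::string::size_type pos = 0;

	// scheme: one or more characters up to a ':' that comes before any
	// '/', '?' or '#'
	std::string::size_type end = url.find_first_of(":/?#");
	if (end != npos && end > 0 && url[end] == ':') {
		parts.scheme = url.substr(0, end);
		pos = end + 1;
	}

	// authority: present only after "//", runs to the next '/', '?' or '#'
	if (url.compare(pos, 2, "//") == 0) {
		pos += 2;
		end = url.find_first_of("/?#", pos);
		if (end == npos)
			end = url.size();
		parts.authority = url.substr(pos, end - pos);
		pos = end;
	}

	// path: everything up to the next '?' or '#'
	end = url.find_first_of("?#", pos);
	if (end == npos)
		end = url.size();
	parts.path = url.substr(pos, end - pos);
	pos = end;

	// query: after '?', up to the next '#'
	if (pos < url.size() && url[pos] == '?') {
		pos++;
		end = url.find('#', pos);
		if (end == npos)
			end = url.size();
		parts.query = url.substr(pos, end - pos);
		pos = end;
	}

	// fragment: everything after '#'
	if (pos < url.size() && url[pos] == '#')
		parts.fragment = url.substr(pos + 1);

	return parts;
}
```

With this sketch, a data URL such as `data:text/plain,hello` yields the
scheme `data`, no authority, and the remainder as the path, which is the
same split the RFC pattern produces.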
2) I don't think it is a good idea to work around problems in other
operating systems.
This is true in that the larger project is not about portability. However,
reducing the complexity of the cross-compile 'touch point' with other
operating systems is still a benefit.
3) It leads to a lot more code to test and debug.
This is true, but I have also added a number of unit tests, which should
provide some level of safety. If further problems turn up, more test cases
can be added.
4) There was code similar to this before I implemented the regexp
parsing, and it was broken.
This is true and, yes, any change is undeniably a risk. I notice that your
change was, in part, triggered by data URLs, so I have now added a unit
test for this (it passes) and updated the patch on this ticket.
5) What is the impact on performance? Is it faster or slower than the
regexp way?
This is a good question which I had not considered. To check this I wrote
a small program which parses a URL containing the major components 1 million
times, creating and then destroying a BUrl object on each iteration. The
URL is:
```
http://loop:pea@xxxxxxxxxxxxxxxxxxx:8888/some/path?key=value#frag
```
The old implementation takes circa 31s of user time and the new
implementation circa 8s, so the new implementation in this ticket is
roughly 3x to 4x faster. This was measured on a VM running on a ~2009 Mac
laptop.
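For anyone who wants to repeat the measurement, a timing loop along these
lines would do. This is a sketch, not the program I actually ran. The helper
name `MeasureCpuSeconds` is hypothetical, and it uses `std::clock()`, which
reports processor time for the whole process and so only approximates the
user-time figure quoted above.

```cpp
#include <ctime>

// Hypothetical helper: runs `body` `iterations` times and returns the
// processor time consumed, in seconds, as reported by std::clock().
template <typename Body>
double MeasureCpuSeconds(long iterations, Body body)
{
	std::clock_t start = std::clock();
	for (long i = 0; i < iterations; i++)
		body();
	std::clock_t stop = std::clock();
	return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}

// Assumed usage on Haiku (requires <Url.h>); the BUrl object is created
// and destroyed once per iteration:
//
//   double seconds = MeasureCpuSeconds(1000000, [] {
//       BUrl url("http://...");	// the URL shown above
//   });
```

Building the same program against the old and the new implementation and
comparing the two results gives the ratio.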
----
In any case, over to you for a decision. The updated patch is also on the
ticket if somebody wants to take it up later.
--
Ticket URL: <https://dev.haiku-os.org/ticket/13002#comment:4>
Haiku <https://dev.haiku-os.org>
Haiku - the operating system.