A little bit more...
Friday, July 27, 2007
Regular expression and CharSequence in Java
A regular expression, specified as a string, is first compiled into an instance of Pattern. The regular expression constructs supported by Pattern are almost the same as those supported by Perl 5 (see the comparison to Perl 5 in the javadoc of class Pattern). The Pattern engine performs traditional NFA-based (nondeterministic finite automaton) matching with ordered alternation, as occurs in Perl 5.
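To make "ordered alternation" concrete, here is a small sketch. The class and method names (OrderedAlternationDemo, firstMatch) are made up for illustration. The alternatives are tried from left to right, and the first one that succeeds wins, even if a later alternative would match more text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OrderedAlternationDemo {
    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Alternatives are tried left to right; the first that succeeds is taken.
        System.out.println(firstMatch("a|ab", "ab")); // a
        System.out.println(firstMatch("ab|a", "ab")); // ab
    }
}
```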
The regex Package
The package has two major classes, Pattern and Matcher. Neither has a public constructor, so clients can't create instances with new directly. A Pattern instance is the compiled intermediate representation of the string-form regular expression, plus some helper methods. Instances of Pattern are immutable, which means they're thread-safe. Each instance of Matcher, produced by a factory method of Pattern, corresponds to one matching of the bound regular expression against a target string (i.e., the string to be matched). Further information about how to use this package in a program can be found in the resources section.
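A minimal sketch of how the two classes are typically used together. The names PatternMatcherDemo, NUMBER and numbersIn are my own, chosen for illustration. The Pattern is compiled once and shared, while a fresh Matcher is obtained from the factory method for each target string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternMatcherDemo {
    // Compiled once; Pattern is immutable, so sharing it between threads is safe.
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    // Collects every run of digits found in the given text.
    static List<String> numbersIn(CharSequence text) {
        List<String> found = new ArrayList<>();
        // Factory method: one Matcher per target string. A Matcher carries
        // matching state, so it should not be shared between threads.
        Matcher m = NUMBER.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(numbersIn("27 July 2007")); // [27, 2007]
    }
}
```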
regex matching in CharSequence
Besides being a part of the Java library in its own right, the regex package is also used by other parts of the library. These usages are listed below:
boolean String.matches( String regex )
String String.replaceAll( String regex, String replacement )
String String.replaceFirst( String regex, String replacement )
String[] String.split( String regex )
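A quick sketch of the four String methods listed above. The class name StringRegexDemo and the helper run are invented for this example. Note that matches requires the whole string to match, not just a part of it.

```java
import java.util.Arrays;

class StringRegexDemo {
    // Applies the four regex-based String methods to s and returns the results as text.
    static String[] run(String s) {
        return new String[] {
            String.valueOf(s.matches("[a-c\\d]+")), // the entire string must match
            s.replaceAll("\\d+", "#"),              // every run of digits replaced
            s.replaceFirst("\\d+", "#"),            // only the first run replaced
            Arrays.toString(s.split("\\d+"))        // runs of digits act as separators
        };
    }

    public static void main(String[] args) {
        for (String line : run("a1b22c333")) {
            System.out.println(line);
        }
        // true
        // a#b#c#
        // a#b22c333
        // [a, b, c]
    }
}
```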
Note that both Pattern and String depend widely on the interface CharSequence rather than on class String or some other concrete implementation of CharSequence. The classes implementing CharSequence mainly include String, StringBuffer and StringBuilder. The interface declares methods that we usually use with String, such as charAt(int), length(), and subSequence(int, int).
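Because the regex API is written against CharSequence, a StringBuilder can be matched without first converting it to a String. A small sketch, with CharSequenceDemo, WORD and countWords as illustrative names of my own:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharSequenceDemo {
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Accepts any CharSequence: String, StringBuilder, StringBuffer, and so on.
    static int countWords(CharSequence text) {
        int count = 0;
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("regular ").append("expression");
        System.out.println(countWords(sb));              // 2
        System.out.println(countWords("one two three")); // 3
        // subSequence(0, 7) is "regular"; Pattern.matches also takes a CharSequence.
        System.out.println(Pattern.matches("\\w+", sb.subSequence(0, 7))); // true
    }
}
```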
Resources
Wednesday, July 25, 2007
Tips of Using Escape Characters
Cited from here:
It goes back to mechanical teletypes; if you issued a line feed followed by carriage return you got a few characters printed backwards in a diagonal stripe because there was insufficient time for the carriage to move back to the left before it started printing. To overcome this, the carriage return is issued first to give the carriage enough time to return, whilst line feeding, before printing started. It has stuck as a standard ever since.
and here:
Different Operating Systems handle newlines in a different way. Here is a short list of the most common ones:
- DOS and Windows
They expect a newline to be the combination of two characters, namely '\r\n' (or 13 followed by 10).
- Unix (and hence Linux as well)
Unix uses a single '\n' to indicate a new line.
- Mac
Macs use a single '\r'.
When strings are read from user input, escape characters can't be entered explicitly. For example, when a string is read from a command-line argument and the user gives the argument hello\nworld, the string read in and stored contains a literal backslash followed by n, which would be written "hello\\nworld" in source code. This is because only the compiler parses escape sequences into the formatting characters they denote (this is what "unescaping" means); text read at run time is taken literally. For the same reason, source code uses escape sequences, the literal form of the formatting characters, rather than the formatting characters themselves, which the compiler could not make sense of inside a string literal.
Note that specifying formatting is not what defines an escape character, though in many cases escape characters are used this way.
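If a program does want user input such as hello\nworld to be treated the way the compiler treats a string literal, it has to do the unescaping itself. Below is a minimal sketch; UnescapeDemo and unescape are hypothetical names, and only \n, \r, \t and \\ are handled. How the argument reaches the program also depends on the shell's own quoting rules, which this sketch does not address.

```java
class UnescapeDemo {
    // Turns the two-character sequences \n, \r, \t and \\ into the characters they denote.
    static String unescape(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\\' && i + 1 < raw.length()) {
                char next = raw.charAt(++i);
                switch (next) {
                    case 'n':  out.append('\n'); break;
                    case 'r':  out.append('\r'); break;
                    case 't':  out.append('\t'); break;
                    case '\\': out.append('\\'); break;
                    default:   out.append('\\').append(next); // unknown escape: keep as typed
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "hello\\nworld" holds the same characters a user would type as hello\nworld.
        String raw = args.length > 0 ? args[0] : "hello\\nworld";
        System.out.println(raw);           // printed as typed, on one line
        System.out.println(unescape(raw)); // printed with a real line break
    }
}
```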
Tuesday, July 24, 2007
Tips on XML Spec
[10] AttValue ::= '"' ([^<&"] | Reference)* '"'
                  | "'" ([^<&'] | Reference)* "'"
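For example, <xxx yyy="..."..."> causes an error because of the extra double-quote character; <xxx yyy="...&quot;..."> can be used instead (or the value can be delimited with single quotes, as the second alternative of the production allows).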
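The rule can be checked with the XML parser that ships with the JDK. This is only a sketch: AttValueDemo and attrOrNull are names invented here, and the element and attribute names follow the xxx/yyy placeholders used above. On a malformed document the default error handler may also print a message to standard error.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class AttValueDemo {
    // Returns the value of attribute yyy on the root element,
    // or null if the document cannot be parsed as well-formed XML.
    static String attrOrNull(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getAttribute("yyy");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(attrOrNull("<xxx yyy=\"a\"b\"/>"));      // null (not well-formed)
        System.out.println(attrOrNull("<xxx yyy=\"a&quot;b\"/>")); // a"b
        System.out.println(attrOrNull("<xxx yyy='a\"b'/>"));       // a"b
    }
}
```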
Resources:
Thursday, July 05, 2007
TDS: Tabular Data Stream Protocol
Resources:
TDS is a protocol, a set of rules describing how to transmit data between two computers. Like any protocol, it defines the types of messages that can be sent, and the order in which they may be sent. Protocols describe the "bits on the wire", how data flow.
In reading this manual, it may be helpful to keep in mind that a protocol is not an API, although the two are related. The server recognizes and speaks a protocol; anything that can send it the correct combination of bytes in the right order can communicate with it. But programmers aren't generally in the business of sending bytes; that's the job of a library. Over the years, there have been a few libraries, each with its own API, that do the work of moving SQL through a TDS pipe. ODBC, db-lib, ct-lib, and JDBC have very different APIs, but they're all one to the server, because on the wire they speak TDS.
The TDS protocol was designed and developed by Sybase Inc. for their Sybase SQL Server relational database engine in 1984. The problem Sybase faced then still exists: There was no commonly accepted application-level protocol to transfer data between a database server and its client. To encourage the use of their product, Sybase came up with a flexible pair of products called netlib and db-lib.
netlib's job was to ferry data between the two computers. To do that, it had to deal with the underlying network protocol. Remember, in those days TCP/IP was not the ubiquitous thing it is today. Besides TCP/IP, netlib ran on DECnet, IPX/SPX, NetBEUI and the like.
db-lib provided an API to the client program, and communicated with the server via netlib. What db-lib sent to the server took the form of a stream of bytes, a structured stream of bytes meant for tables of data, a Tabular Data Stream.
In 1990 Sybase entered into a technology sharing agreement with Microsoft which resulted in Microsoft marketing its own SQL Server. Microsoft kept the db-lib API and added ODBC. (Microsoft has since added other APIs, too.) At about the same time, Sybase introduced a more powerful "successor" to db-lib, called ct-lib, and called the pair OpenClient.
ct-lib, db-lib, and ODBC are APIs that — however different their programming style may be — all use netlib to communicate to the server. The language they use is TDS.
The TDS protocol comes in several flavors, most of which have never been openly documented. If anything, it's probably considered to be something like a trade secret, or at least proprietary technology. The exception is TDS 5.0, used exclusively by Sybase, for which documentation is available from Sybase.
Test publishing from email
About Me
- Kenyth
- I'm finishing my master's degree in Software Engineering, Computer Science. I believe in and have been following what Forrest Gump's mom said: you have to do the best with what God gave you.