URL and URI

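Introduction

URL stands for Uniform Resource Locator; URI stands for Uniform Resource Identifier. URI is the newer and broader concept, and it includes URL (for example, an ID card number is a URI, while a home address is both a URI and a URL). In Java, the URI class is also more capable, since it can encode illegal characters automatically.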

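Components of a URL

protocol : host (port) : file path : query parameters : reference (fragment)
URLs are either relative or absolute. A relative URL borrows the protocol and host (port) from a base URL. Below, a base URL is used to express the two URLs page1.html and page2.html. The relative name is resolved against the directory that the base URL points to.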

http://example.com/pages/page1.html
http://example.com/pages/page2.html
URL myURL = new URL("http://example.com/pages/");  // base URL
URL page1URL = new URL(myURL, "page1.html");       // http://example.com/pages/page1.html
URL page2URL = new URL(myURL, "page2.html");       // http://example.com/pages/page2.html


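A URL address must not contain certain special characters, and the URL class does not encode or decode them automatically; use URI for that.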
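As a minimal sketch of letting URI do the encoding, the class below builds a URI from separate components so that an illegal character (a space) gets quoted. The class name `UrlEncodingSketch`, the helper `encodePath`, and the sample path are assumptions made for illustration only.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UrlEncodingSketch {
    // Builds an http URI from a host and a raw path. The multi-argument
    // URI constructor quotes illegal characters such as the space.
    public static String encodePath(String host, String path) {
        try {
            URI uri = new URI("http", host, path, null);
            return uri.toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical path containing a space.
        System.out.println(encodePath("example.com", "/pages/my page.html"));
    }
}
```

If a URL object is needed afterwards, calling `toURL()` on the resulting URI is one way to get it.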

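Overview of the URL API

  1. Constructors. Constructing a URL also sets up a URLStreamHandler, the low-level interface that implements the protocol; if none is specified, the system uses the default handler for that protocol.
  2. Accessor methods, which return the individual parsed components of the URL.
  3. Methods that open a stream and methods that open a connection.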
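The accessor group in the list above can be sketched as follows. The class name `UrlPartsSketch`, the helper `describe`, and the sample address are assumptions for illustration. The stream and connection methods (`openStream`, `openConnection`) are left out because they need network access.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class UrlPartsSketch {
    // Joins the parsed components of a URL with '|' so they are easy to inspect.
    public static String describe(String spec) {
        try {
            URL url = new URL(spec);
            return String.join("|",
                    url.getProtocol(),             // protocol
                    url.getHost(),                 // host
                    String.valueOf(url.getPort()), // port, -1 if not set
                    url.getPath(),                 // file path
                    url.getQuery(),                // query parameters
                    url.getRef());                 // reference (fragment)
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical address that has every component.
        System.out.println(describe("http://example.com:8080/pages/page1.html?lang=en#top"));
    }
}
```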

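Introduction to URI

Components and classification

A URI generally consists of three parts:
[scheme:]scheme-specific-part[#fragment]
URIs are either relative or absolute, and either hierarchical or opaque. An opaque URI is always absolute, and its scheme-specific part (the part after the scheme) does not begin with "/". A hierarchical URI, such as an http URI, can be absolute or relative. A relative URI has no scheme; its path may or may not begin with "/".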
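To make the classification concrete, here is a small sketch built on `URI.isAbsolute()` and `URI.isOpaque()`. The class name `UriKindSketch`, the helper `classify`, and the sample strings are assumptions for illustration.

```java
import java.net.URI;

public class UriKindSketch {
    // Reports whether a URI is absolute or relative, and opaque or hierarchical.
    public static String classify(String s) {
        URI uri = URI.create(s);
        String abs = uri.isAbsolute() ? "absolute" : "relative";
        String kind = uri.isOpaque() ? "opaque" : "hierarchical";
        return abs + " " + kind;
    }

    public static void main(String[] args) {
        String[] samples = {
            "mailto:user@example.com",             // scheme-specific part has no leading '/'
            "http://example.com/pages/page1.html", // scheme plus "//..." part
            "page1.html",                          // no scheme
            "/pages/page1.html"                    // no scheme, path starts with '/'
        };
        for (String s : samples) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```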


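Operations on a URI:

  1. Normalization: removes the "." and ".." segments from the path.
  2. Resolution: combines an absolute (base) address with a relative address.
  3. Relativization: computes the relative address of one URI with respect to another.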
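The three operations above map onto `normalize`, `resolve`, and `relativize` in `java.net.URI`. The sketch below reuses the example.com addresses from earlier; the class name `UriOpsSketch` and its helper methods are assumptions for illustration.

```java
import java.net.URI;

public class UriOpsSketch {
    // Normalization: drops "." and resolves ".." segments in the path.
    public static String normalize(String s) {
        return URI.create(s).normalize().toString();
    }

    // Resolution: combines a base URI with a relative reference.
    public static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    // Relativization: the inverse of resolution.
    public static String relativize(String base, String target) {
        return URI.create(base).relativize(URI.create(target)).toString();
    }

    public static void main(String[] args) {
        String base = "http://example.com/pages/";
        System.out.println(normalize("http://example.com/pages/./sub/../page1.html"));
        System.out.println(resolve(base, "page2.html"));
        System.out.println(relativize(base, "http://example.com/pages/page1.html"));
    }
}
```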

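URI character categories

The categories are as follows:

  1. alpha: letters
  2. digit: digits
  3. alphanum: letters and digits
  4. unreserved: alphanum plus the characters in "_-!.~'()*"
  5. punct: the characters in ",;:$&+="
  6. reserved: punct plus the characters in "?/[]@"
  7. other: Unicode characters outside US-ASCII, excluding control characters and spaces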
RFC 2396 specifies precisely which characters are permitted in the various components of a URI reference. The following categories, most of which are taken from that specification, are used below to describe these constraints:
    alpha	The US-ASCII alphabetic characters, 'A' through 'Z' and 'a' through 'z'
    digit	The US-ASCII decimal digit characters, '0' through '9'
    alphanum	All alpha and digit characters
    unreserved    	All alphanum characters together with those in the string "_-!.~'()*"
    punct	The characters in the string ",;:$&+="
    reserved	All punct characters together with those in the string "?/[]@"
    escaped	Escaped octets, that is, triplets consisting of the percent character ('%') followed by two hexadecimal digits ('0'-'9', 'A'-'F', and 'a'-'f')
    other	The Unicode characters that are not in the US-ASCII character set, are not control characters (according to the Character.isISOControl method), and are not space characters (according to the Character.isSpaceChar method)  (Deviation from RFC 2396, which is limited to US-ASCII)
The set of all legal URI characters consists of the unreserved, reserved, escaped, and other characters.
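As a rough illustration of these categories, the sketch below classifies a single character using the strings quoted above. It is a simplified assumption, not how `java.net.URI` is implemented: escaped triplets are not handled, so "%" on its own falls through to the last case. The class name `UriCharsSketch` and the helper `category` are hypothetical.

```java
public class UriCharsSketch {
    static final String UNRESERVED_MARKS = "_-!.~'()*";
    static final String PUNCT = ",;:$&+=";
    static final String RESERVED_EXTRA = "?/[]@";

    // Returns the category name of one character, or "needs-escaping"
    // if it falls into none of the unreserved, reserved, or other sets.
    public static String category(char c) {
        boolean ascii = c < 128;
        if (ascii && Character.isLetterOrDigit(c)) {
            return "unreserved"; // alphanum
        }
        if (UNRESERVED_MARKS.indexOf(c) >= 0) {
            return "unreserved";
        }
        if (PUNCT.indexOf(c) >= 0 || RESERVED_EXTRA.indexOf(c) >= 0) {
            return "reserved";
        }
        if (!ascii && !Character.isISOControl(c) && !Character.isSpaceChar(c)) {
            return "other";
        }
        return "needs-escaping";
    }

    public static void main(String[] args) {
        char[] samples = {'a', '7', '~', '?', '=', ' ', '\u4e2d'};
        for (char c : samples) {
            System.out.println("'" + c + "' -> " + category(c));
        }
    }
}
```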

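URI encoding

Encoding is needed for two reasons: characters outside US-ASCII, and illegal characters such as spaces.

  1. The single-argument constructor requires illegal characters to be quoted already, that is, written as escape sequences.
  2. The multi-argument constructors take the components as written and quote illegal characters where required.
  3. The get* methods decode the escape sequences in the component they return.
  4. The getRaw* methods return the component as is, without decoding escape sequences.
  5. The toString method returns a URI string with all necessary quotation but which may contain other characters.
  6. The toASCIIString method returns a fully quoted and encoded URI string that does not contain any other characters.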
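The difference between the decoded getters, the raw getters, `toString`, and `toASCIIString` can be seen in a short sketch. The class name `UriQuotingSketch`, the helper `build`, and the sample path are assumptions for illustration; the path is written with `\u00e9` escapes so the example does not depend on the source file encoding.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriQuotingSketch {
    // Hypothetical path containing a space (illegal) and the
    // non-ASCII letter \u00e9 (an "other" character).
    public static URI build() {
        try {
            // Multi-argument constructor: components are given unquoted.
            return new URI("http", "example.com", "/docs/r\u00e9sum\u00e9 file.html", null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI uri = build();
        System.out.println(uri.getPath());       // decoded path, space restored
        System.out.println(uri.getRawPath());    // space quoted, \u00e9 kept as is
        System.out.println(uri.toString());      // may still contain "other" characters
        System.out.println(uri.toASCIIString()); // US-ASCII only
    }
}
```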


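URI complements URL. A URI has no underlying protocol implementation and no handler interface. For example, if you misspell the protocol as htp, constructing a URL throws an error because no handler exists for that protocol, but constructing a URI does not. Conversely, URI performs some syntax checks, for example rejecting spaces, which URL does not.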
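The two contrasts above can be checked with a small sketch. It assumes a default JVM with no custom handler registered for an "htp" protocol; the class name `UrlVsUriSketch` and the helpers `urlAccepts` and `uriAccepts` are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public class UrlVsUriSketch {
    // True if the URL constructor accepts the string.
    public static boolean urlAccepts(String s) {
        try {
            new URL(s);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    // True if the single-argument URI constructor accepts the string.
    public static boolean uriAccepts(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String badScheme = "htp://example.com/";    // misspelled protocol
        String withSpace = "http://example.com/a b"; // unquoted space in the path
        System.out.println("URL, htp:   " + urlAccepts(badScheme));
        System.out.println("URI, htp:   " + uriAccepts(badScheme));
        System.out.println("URL, space: " + urlAccepts(withSpace));
        System.out.println("URI, space: " + uriAccepts(withSpace));
    }
}
```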